FDP WAF Stress Result Summary: R0 Window-Local 32GiB 3-Mode Batch

Date: 2026-06-15
Reference scenario: fdp/reports/2026-06-15_poc_harness_evaluation_scenario.md
Run output: /home/ny/fdp_measure_waf_r0_window32g_readme8

Overall Result

PASS. All three modes (no_fdp, mixed, separated) completed the README-scale stress run with warmup=2 and measurement=8 after a 32GiB window-local R0 precondition, and WAF was available for all three modes.

| Metric | Value |
| --- | --- |
| Result status | PASS |
| Mode(s) | no_fdp, mixed, separated |
| Run ID | R_no_fdp_20260615T222219, R_mixed_20260615T223842, R_separated_20260615T225510 |
| Device path | /dev/ng0n1 |
| Block device path | /dev/nvme0n1 |
| Workers | 9 per mode |
| Warmup iterations | 2 |
| Measurement iterations | 8 |
| Total replay processes | 90 per mode |
| Process failures | 0 |
| Replay record failures | 0 |
| Operation errors | 0 |
| Total wall time | no_fdp 599.24s, mixed 605.94s, separated 597.14s |
| Warmup wall time | no_fdp 118.13s, mixed 118.68s, separated 117.74s |
| Measurement wall time | no_fdp 481.10s, mixed 487.25s, separated 479.39s |

Precondition

To avoid the clean-state WAF artifact noted in the original PoC, a window-local R0 precondition was run immediately before each mode.

| Metric | Value |
| --- | --- |
| Precondition status | used |
| Precondition type | window-local |
| R0 output | /home/ny/fdp_measure_waf_r0_window32g_readme8/R0_precondition_20260615T222201, /home/ny/fdp_measure_waf_r0_window32g_readme8/R0_precondition_20260615T223825, /home/ny/fdp_measure_waf_r0_window32g_readme8/R0_precondition_20260615T225452 |
| Namespace discard | yes |
| Offset bytes | 2199023255552 |
| Size | 32GiB |
| Sequential fill | yes |
| Random overwrite | yes |
| Random passes | 1 |
| Vendor media counter during R0 | collected |
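For orientation, the byte range covered by the precondition window can be worked out from the offset and size above. This is a minimal sketch, assuming the window is the half-open range [offset, offset + size) and that GiB/TiB are binary units; the variable names are illustrative and not taken from the harness.

```python
GIB = 1 << 30
TIB = 1 << 40

# Values from the Precondition table.
window_offset = 2_199_023_255_552   # "Offset bytes"
window_size = 32 * GIB              # "Size" = 32GiB

# Assumption: the window is [offset, offset + size) on the namespace.
window_end = window_offset + window_size

print(f"offset = {window_offset / TIB:g} TiB")
print(f"size   = {window_size / GIB:g} GiB ({window_size} bytes)")
print(f"end    = {window_end} bytes")
```

Under these assumptions the window starts exactly at the 2 TiB mark of the namespace.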

WAF Result

| Mode | Host write bytes delta | Host write delta | Media write bytes delta | Media write delta | WAF | Write amplification over host writes | WAF status |
| --- | --- | --- | --- | --- | --- | --- | --- |
| no_fdp | 511,058,432,000 | 475.96GiB | 519,170,949,120 | 483.52GiB | 1.015874 | 1.587% | available |
| mixed | 500,674,560,000 | 466.29GiB | 508,619,390,976 | 473.69GiB | 1.015868 | 1.587% | available |
| separated | 373,750,784,000 | 348.08GiB | 379,687,010,304 | 353.61GiB | 1.015883 | 1.588% | available |

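separated had the lowest host and media write volume. However, the WAF ratio itself is nearly identical across the three modes at about 1.0159, so this batch alone gives only weak support for the conclusion that "separated lowered the WAF ratio".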
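The WAF column can be reproduced from the byte deltas in the WAF Result table. The sketch below assumes WAF is defined as media write bytes divided by host write bytes over the measurement window, which matches the reported values; the helper name `waf` is illustrative and not the harness's own function.

```python
# Byte deltas copied from the WAF Result table: (host bytes, media bytes).
DELTAS = {
    "no_fdp": (511_058_432_000, 519_170_949_120),
    "mixed": (500_674_560_000, 508_619_390_976),
    "separated": (373_750_784_000, 379_687_010_304),
}

def waf(host_bytes: int, media_bytes: int) -> float:
    """Media bytes written per host byte written (assumed WAF definition)."""
    if host_bytes <= 0:
        raise ValueError("host write delta must be positive")
    return media_bytes / host_bytes

WAF_BY_MODE = {mode: waf(h, m) for mode, (h, m) in DELTAS.items()}

for mode, value in WAF_BY_MODE.items():
    overhead_pct = (value - 1.0) * 100.0
    print(f"{mode:<10} WAF={value:.6f} overhead={overhead_pct:.3f}%")

spread = max(WAF_BY_MODE.values()) - min(WAF_BY_MODE.values())
print(f"spread across modes = {spread:.2e}")
```

The spread between the highest and lowest ratio is on the order of 1e-5, which is why the ratio by itself does not separate the modes in this batch.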

Expected vs Observed

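The expectation in the original PoC and the README is that separating KV cache streams by similar lifetime reduces unnecessary copies in the SSD's internal GC, which in turn lowers WAF and NAND wear and stabilizes p99 tail latency. Per the README, mixed is an intentionally poor FDP placement, and separated is the mode that tests the effect of lifetime/RUH separation.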

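| Expectation | Expected direction | Observed in this batch | Assessment |
| --- | --- | --- | --- |
| WAF ratio | separated lower than no_fdp/mixed | no_fdp 1.015874, mixed 1.015868, separated 1.015883 | Not demonstrated. WAF ratio is effectively identical across the three modes |
| Host write pressure | separated lower than baseline | separated: -26.9% vs no_fdp, -25.4% vs mixed | Positive signal |
| Media write pressure | separated reduces media writes | separated 353.61GiB, no_fdp 483.52GiB, mixed 473.69GiB | Positive signal |
| Tail latency | p99 stabilizes under separated | write p99: separated 2.398ms, no_fdp 2.732ms, mixed 3.337ms | Positive signal |
| Clean-state caveat | Preconditioning applied before WAF evaluation | 32GiB window-local R0 run before each mode | Met |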

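This batch therefore matches the original/README expectations on write pressure reduction and tail latency improvement. The reduction in the WAF ratio itself, which is the key metric, was not observed in these results. The result should be read not as "separated lowered WAF" but as "separated lowered host/media write volume and p99 for this workload, while the WAF ratio stayed at the same level".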
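The percentage comparisons above follow from the reported deltas and p99 values. A minimal sketch, assuming relative change is computed as (separated - baseline) / baseline; the function and dictionary names are illustrative.

```python
def relative_change_pct(value: float, baseline: float) -> float:
    """Signed percentage change of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100.0

# Reported values per mode: host bytes, media bytes, write p99 in ms.
REPORTED = {
    "no_fdp": {"host": 511_058_432_000, "media": 519_170_949_120, "write_p99": 2.732},
    "mixed": {"host": 500_674_560_000, "media": 508_619_390_976, "write_p99": 3.337},
    "separated": {"host": 373_750_784_000, "media": 379_687_010_304, "write_p99": 2.398},
}

CHANGES = {
    (metric, baseline): relative_change_pct(
        REPORTED["separated"][metric], REPORTED[baseline][metric]
    )
    for metric in ("host", "media", "write_p99")
    for baseline in ("no_fdp", "mixed")
}

for (metric, baseline), pct in CHANGES.items():
    print(f"separated vs {baseline:<6} {metric:<9} {pct:+.1f}%")
```

Because the WAF ratio is nearly the same in every mode, the media write reduction tracks the host write reduction almost exactly, so the two rows are not independent evidence.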

Counter Collection

| Counter | Source | Status |
| --- | --- | --- |
| Host writes | nvme smart-log /dev/nvme0n1 -o json | collected |
| Media writes | nvme ocp smart-add-log /dev/nvme0n1 -o json | collected |
| FDP stats | outer nvme fdp stats /dev/nvme0 --endgrp-id=1 snapshots | collected |
| FDP RUH logs | xNVMe FDP log commands | not collected |

Notes:

  • measurement_after.json warnings contain xnvme unavailable; skipped FDP logs.
  • Formal WAF used vendor OCP Physical media units written, not FDP MBMW proxy.
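As an illustration of how a host write byte delta can be derived from two smart-log snapshots, the sketch below converts the NVMe SMART "Data Units Written" counter, which is reported in units of 1000 x 512 bytes, into bytes. It assumes the JSON output exposes that counter under the key `data_units_written`; the function name and the snapshot values are hypothetical and not taken from the harness or the run output.

```python
import json

# NVMe SMART "Data Units Written" is counted in units of 1000 x 512 bytes.
BYTES_PER_DATA_UNIT = 1000 * 512

def host_write_bytes_delta(before_json: str, after_json: str) -> int:
    """Host bytes written between two smart-log JSON snapshots.

    Assumes each snapshot has an integer `data_units_written` field.
    """
    before = json.loads(before_json)["data_units_written"]
    after = json.loads(after_json)["data_units_written"]
    if after < before:
        raise ValueError("counter went backwards; snapshots may be swapped")
    return (after - before) * BYTES_PER_DATA_UNIT

# Hypothetical snapshots: only the difference (998,161 units) is meaningful.
before = json.dumps({"data_units_written": 1_000_000})
after = json.dumps({"data_units_written": 1_998_161})
print(host_write_bytes_delta(before, after))
```

A delta of 998,161 data units corresponds to the 511,058,432,000 bytes reported for no_fdp, which is consistent with the host deltas being whole multiples of 512,000 bytes.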

Measurement Workload Pressure

The table below uses the separated mode workload mapping. Measurement-only logical store writes are shown for measurement_iterations=8.

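| Workload | Workers | Class | FDP Data RUHs | FDP Metadata RUHs | Logical Store Writes | Store records | Prefetch records |
| --- | --- | --- | --- | --- | --- | --- | --- |
| llama8b_chat_chunk256 | 2 | hot_churn | 0,1 | 2 | 160GiB | 40,960 | 5,120 |
| llama70b_longctx_chunk1024 | 1 | large_model | 3,4 | 5 | 80GiB | 2,560 | 2,512 |
| rag_shared_prefix_chunk512 | 2 | cold_rag | 3,4 | 5 | 192GiB | 12,288 | 48,864 |
| random_prompts_chunk128 | 3 | hot_churn | 0,1 | 2 | 192GiB | 98,304 | 1,536 |
| metadata_heavy_small_objects | 1 | metadata_heavy | 6 | 7 | 64GiB | 65,536 | 0 |
| Total | 9 | | | | 688GiB | 219,648 | 58,032 |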
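The Total row and a rough per-record size can be cross-checked from the per-workload figures. This is a sketch under two assumptions that the report does not state: GiB means 2^30 bytes, and the average is simply logical store writes divided by store records. The names are illustrative.

```python
GIB = 1 << 30
MIB = 1 << 20

# Per workload: (workers, logical store writes in GiB, store records, prefetch records)
WORKLOADS = {
    "llama8b_chat_chunk256": (2, 160, 40_960, 5_120),
    "llama70b_longctx_chunk1024": (1, 80, 2_560, 2_512),
    "rag_shared_prefix_chunk512": (2, 192, 12_288, 48_864),
    "random_prompts_chunk128": (3, 192, 98_304, 1_536),
    "metadata_heavy_small_objects": (1, 64, 65_536, 0),
}

# Column-wise sums, to compare against the Total row.
TOTALS = tuple(sum(column) for column in zip(*WORKLOADS.values()))

def avg_store_record_mib(logical_gib: int, store_records: int) -> float:
    """Average logical bytes per store record, expressed in MiB."""
    return logical_gib * GIB / store_records / MIB

AVG_MIB = {
    name: avg_store_record_mib(gib, records)
    for name, (_, gib, records, _) in WORKLOADS.items()
}

print("totals (workers, GiB, store, prefetch):", TOTALS)
for name, mib in AVG_MIB.items():
    print(f"{name:<30} ~{mib:g} MiB per store record")
```

The column sums agree with the Total row, which also confirms how the flattened table cells split into columns.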

Latency

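| Mode | write p50 ms | write p99 ms | read p50 ms | read p99 ms | all p99 ms |
| --- | --- | --- | --- | --- | --- |
| no_fdp | 0.083 | 2.732 | N/A | 0.319 | 2.672 |
| mixed | 0.084 | 3.337 | N/A | 0.308 | 3.234 |
| separated | 0.079 | 2.398 | N/A | 0.303 | 2.338 |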

separated had the lowest write p99 and all p99 in this batch.

Log Observations

Replay completed cleanly. The warning categories that appeared in the earlier probe runs did not recur in this batch.

| Warning Category | Count | Files |
| --- | --- | --- |
| Cleaning up in-flight store task | 0 | 0 |
| Failed batched memory allocation | 0 | 0 |
| Device metadata slot_bytes mismatch | 0 | 0 |
| FDP metadata payload rejected | 0 | 0 |
| io_uring fixed-buffer registration disabled | 0 | 0 |
| Tracebacks | 0 | 0 |
| ERROR logs | 0 | 0 |
| Exceptions | 0 | 0 |

Validity Against Scenario

| Criterion | Required by scenario | Observed | Status |
| --- | --- | --- | --- |
| Original goal: lifetime separation | separated RUH mapping separates hot/cold/metadata | hot/small/random data 0,1, cold/large/rag data 3,4, metadata 2/5/7 | PASS |
| Original goal: WAF available | host and media counters collected | WAF available in all 3 modes | PASS |
| Original goal: clean-state caveat addressed | R0 precondition status documented | 32GiB window-local R0 before each mode | PASS |
| README scale | stress | stress | PASS |
| README warmup | 2 | 2 | PASS |
| README measurement | 8 | 8 | PASS |
| Process failures | 0 | 0 | PASS |
| Replay record failures | 0 | 0 | PASS |
| Host write delta | > 0 | all modes > 0 | PASS |
| Media write delta | > 0 | all modes > 0 | PASS |
| p99 latency available | write p99 at minimum | write/all p99 available; read p99 parsed from read-like records | PASS |

Conclusion

This batch confirms that the PoC harness can run the README-style stress, warmup=2, measurement=8 workload under the original-document requirement of preconditioning before WAF evaluation. All three modes completed without replay failures, and formal WAF was available in all modes.

The main positive signal is that separated reduced host write volume by 26.9% versus no_fdp and 25.4% versus mixed, while also showing the lowest write p99. The WAF ratio itself was nearly identical across all modes at about 1.0159, so this run supports reduced write pressure and tail latency for separated, but not a strong WAF-ratio improvement claim. The next step is either repeated trials under the same 32GiB R0 condition or an expanded R0 span of 256GiB/512GiB to test whether the result is stable.