FDP WAF Stress Result Summary: R0 Window-Local 32GiB 3-Mode Batch
Date: 2026-06-15
Reference scenario: fdp/reports/2026-06-15_poc_harness_evaluation_scenario.md
Run output: /home/ny/fdp_measure_waf_r0_window32g_readme8
Overall Result
PASS. All three modes (no_fdp, mixed, separated) completed the README-defined stress workload with warmup=2 and measurement=8 after an R0 window-local 32GiB precondition, and WAF was reported as available in all three modes.
| Metric | Value |
|---|---|
| Result status | PASS |
| Mode(s) | no_fdp, mixed, separated |
| Run ID | R_no_fdp_20260615T222219, R_mixed_20260615T223842, R_separated_20260615T225510 |
| Device path | /dev/ng0n1 |
| Block device path | /dev/nvme0n1 |
| Workers | 9 per mode |
| Warmup iterations | 2 |
| Measurement iterations | 8 |
| Total replay processes | 90 per mode |
| Process failures | 0 |
| Replay record failures | 0 |
| Operation errors | 0 |
| Total wall time | no_fdp 599.24s, mixed 605.94s, separated 597.14s |
| Warmup wall time | no_fdp 118.13s, mixed 118.68s, separated 117.74s |
| Measurement wall time | no_fdp 481.10s, mixed 487.25s, separated 479.39s |
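The run accounting in this table can be cross-checked with a few lines of arithmetic. The sketch below assumes one replay process per worker per iteration (warmup and measurement combined), which is an inference from the numbers and not something the report states. The helper name unaccounted_seconds is hypothetical.

```python
# Values copied from the summary table above.
WORKERS = 9
WARMUP_ITERATIONS = 2
MEASUREMENT_ITERATIONS = 8

# Assumption: one replay process per worker per iteration.
expected_processes = WORKERS * (WARMUP_ITERATIONS + MEASUREMENT_ITERATIONS)

# (warmup_s, measurement_s, total_s) per mode.
wall_times = {
    "no_fdp": (118.13, 481.10, 599.24),
    "mixed": (118.68, 487.25, 605.94),
    "separated": (117.74, 479.39, 597.14),
}


def unaccounted_seconds(warmup_s, measurement_s, total_s):
    """Wall time not covered by the warmup and measurement phases."""
    return total_s - (warmup_s + measurement_s)


gaps = {mode: unaccounted_seconds(*t) for mode, t in wall_times.items()}

print(f"expected replay processes per mode: {expected_processes}")
for mode, gap in gaps.items():
    print(f"{mode}: total - (warmup + measurement) = {gap:.2f}s")
```

Under that assumption the process count matches the reported 90 per mode, and the phase times add up to the total within rounding.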
Precondition
To avoid the clean-state WAF artifact noted in the original PoC, a window-local R0 precondition was run immediately before each mode.
| Metric | Value |
|---|---|
| Precondition status | used |
| Precondition type | window-local |
| R0 output | /home/ny/fdp_measure_waf_r0_window32g_readme8/R0_precondition_20260615T222201, /home/ny/fdp_measure_waf_r0_window32g_readme8/R0_precondition_20260615T223825, /home/ny/fdp_measure_waf_r0_window32g_readme8/R0_precondition_20260615T225452 |
| Namespace discard | yes |
| Offset bytes | 2199023255552 |
| Size | 32GiB |
| Sequential fill | yes |
| Random overwrite | yes |
| Random passes | 1 |
| Vendor media counter during R0 | collected |
WAF Result
| Mode | Host write bytes delta | Host write delta | Media write bytes delta | Media write delta | WAF | Write amplification over host writes | WAF status |
|---|---|---|---|---|---|---|---|
| no_fdp | 511,058,432,000 | 475.96GiB | 519,170,949,120 | 483.52GiB | 1.015874 | 1.587% | available |
| mixed | 500,674,560,000 | 466.29GiB | 508,619,390,976 | 473.69GiB | 1.015868 | 1.587% | available |
| separated | 373,750,784,000 | 348.08GiB | 379,687,010,304 | 353.61GiB | 1.015883 | 1.588% | available |
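separated had the lowest host and media write volume. However, the WAF ratio itself was nearly identical across the three modes at about 1.0159, so this batch alone gives only weak support for the claim that "separated lowered the WAF ratio."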
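The WAF column can be reproduced from the byte deltas in the table. The following is a minimal sketch, assuming WAF is defined as media write bytes delta divided by host write bytes delta, which is consistent with the reported values. The function name waf is illustrative and is not taken from the harness.

```python
def waf(host_write_bytes, media_write_bytes):
    """Write amplification factor: media bytes written per host byte written."""
    if host_write_bytes <= 0:
        raise ValueError("host write delta must be positive")
    return media_write_bytes / host_write_bytes


# (host write bytes delta, media write bytes delta) copied from the table.
deltas = {
    "no_fdp": (511_058_432_000, 519_170_949_120),
    "mixed": (500_674_560_000, 508_619_390_976),
    "separated": (373_750_784_000, 379_687_010_304),
}

ratios = {mode: waf(host, media) for mode, (host, media) in deltas.items()}

for mode, ratio in ratios.items():
    overhead_pct = (ratio - 1.0) * 100
    print(f"{mode}: WAF={ratio:.6f}, amplification over host writes={overhead_pct:.3f}%")
```

The spread between the highest and lowest ratio is on the order of 1e-5, which is why the three modes are treated as equivalent on this metric.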
Expected vs Observed
The original PoC and the README expect that separating KV cache streams by lifetime, so that data with similar lifetimes is grouped together, reduces unnecessary GC copies inside the SSD, which in turn lowers WAF and NAND wear and stabilizes p99 tail latency. Per the README, mixed is an intentionally poor FDP placement, and separated is the mode that tests the effect of lifetime/RUH separation.
| Expectation | Expected direction | Observed in this batch | Assessment |
|---|---|---|---|
| WAF ratio | separated lower than no_fdp/mixed | no_fdp 1.015874, mixed 1.015868, separated 1.015883 | Not demonstrated. WAF ratio is effectively identical across the three modes |
| Host write pressure | separated lower than baseline | separated is -26.9% vs no_fdp, -25.4% vs mixed | Positive signal |
| Media write pressure | separated reduces media writes | separated 353.61GiB, no_fdp 483.52GiB, mixed 473.69GiB | Positive signal |
| Tail latency | p99 stabilizes under separated | write p99: separated 2.398ms, no_fdp 2.732ms, mixed 3.337ms | Positive signal |
| Clean-state caveat | Preconditioning applied before WAF evaluation | 32GiB window-local R0 before each mode | Met |
This batch therefore matches the original/README expectations for reduced write pressure and improved tail latency. The key metric, a reduction in the WAF ratio itself, was not observed. The result should be read not as "separated lowered WAF" but as "separated lowered host/media write volume and p99 for this workload, while the WAF ratio stayed at the same level."
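The percentage differences quoted for separated follow from a plain relative-change calculation against each baseline. The sketch below recomputes them from the reported host write, media write, and write p99 values; the helper relative_change_pct is a hypothetical name used only for illustration.

```python
def relative_change_pct(value, baseline):
    """Signed percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


# Reported values per mode: host bytes delta, media bytes delta, write p99 (ms).
observed = {
    "no_fdp": {"host": 511_058_432_000, "media": 519_170_949_120, "write_p99_ms": 2.732},
    "mixed": {"host": 500_674_560_000, "media": 508_619_390_976, "write_p99_ms": 3.337},
    "separated": {"host": 373_750_784_000, "media": 379_687_010_304, "write_p99_ms": 2.398},
}

changes = {
    (metric, baseline): relative_change_pct(
        observed["separated"][metric], observed[baseline][metric]
    )
    for metric in ("host", "media", "write_p99_ms")
    for baseline in ("no_fdp", "mixed")
}

for (metric, baseline), pct in changes.items():
    print(f"separated vs {baseline} [{metric}]: {pct:+.1f}%")
```

Host and media writes shrink by almost the same percentage against each baseline, which is the arithmetic reason the WAF ratio does not move.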
Counter Collection
| Counter | Source | Status |
|---|---|---|
| Host writes | nvme smart-log /dev/nvme0n1 -o json | collected |
| Media writes | nvme ocp smart-add-log /dev/nvme0n1 -o json | collected |
| FDP stats | outer nvme fdp stats /dev/nvme0 --endgrp-id=1 snapshots | collected |
| FDP RUH logs | xNVMe FDP log commands | not collected |
Notes:
- measurement_after.json warnings contain "xnvme unavailable; skipped FDP logs."
- Formal WAF used vendor OCP Physical media units written, not FDP MBMW proxy.
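For the host-write side, the NVMe SMART log reports data_units_written in units of 1,000 512-byte blocks, so a byte delta can be derived from two snapshots. The sketch below assumes the harness converts the counter this way; the report does not show its conversion code, and the snapshot values are hypothetical, chosen only so that the delta matches the no_fdp row.

```python
# NVMe SMART "Data Units Written" counts thousands of 512-byte units.
BYTES_PER_DATA_UNIT = 512 * 1000


def host_write_bytes_delta(before, after):
    """Host bytes written between two parsed `nvme smart-log -o json` snapshots."""
    units = after["data_units_written"] - before["data_units_written"]
    if units < 0:
        raise ValueError("counter went backwards; snapshots may be swapped")
    return units * BYTES_PER_DATA_UNIT


# Hypothetical snapshots; only the difference is meaningful here.
snapshot_before = {"data_units_written": 1_000_000}
snapshot_after = {"data_units_written": 1_998_161}

delta_bytes = host_write_bytes_delta(snapshot_before, snapshot_after)
print(f"host write delta: {delta_bytes:,} bytes ({delta_bytes / 2**30:.2f}GiB)")
```

The reported host write deltas are exact multiples of 512,000 bytes, which is consistent with this unit, though that alone does not confirm how the harness computes them.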
Measurement Workload Pressure
The table below uses the separated mode workload mapping. Measurement-only logical store writes are shown for measurement_iterations=8.
| Workload | Workers | Class | FDP Data RUHs | FDP Metadata RUHs | Logical Store Writes | Store records | Prefetch records |
|---|---|---|---|---|---|---|---|
| llama8b_chat_chunk256 | 2 | hot_churn | 0,1 | 2 | 160GiB | 40,960 | 5,120 |
| llama70b_longctx_chunk1024 | 1 | large_model | 3,4 | 5 | 80GiB | 2,560 | 2,512 |
| rag_shared_prefix_chunk512 | 2 | cold_rag | 3,4 | 5 | 192GiB | 12,288 | 48,864 |
| random_prompts_chunk128 | 3 | hot_churn | 0,1 | 2 | 192GiB | 98,304 | 1,536 |
| metadata_heavy_small_objects | 1 | metadata_heavy | 6 | 7 | 64GiB | 65,536 | 0 |
| Total | 9 | | | | 688GiB | 219,648 | 58,032 |
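The totals row and the implied write size per store record can be derived directly from the table. The sketch below is only arithmetic over the reported numbers; the per-record average is a derived figure, not something the harness reports, and avg_store_mib is a hypothetical helper name.

```python
MIB_PER_GIB = 1024

# workload: (workers, logical store writes in GiB, store records, prefetch records)
workloads = {
    "llama8b_chat_chunk256": (2, 160, 40_960, 5_120),
    "llama70b_longctx_chunk1024": (1, 80, 2_560, 2_512),
    "rag_shared_prefix_chunk512": (2, 192, 12_288, 48_864),
    "random_prompts_chunk128": (3, 192, 98_304, 1_536),
    "metadata_heavy_small_objects": (1, 64, 65_536, 0),
}


def avg_store_mib(store_gib, store_records):
    """Average logical bytes per store record, in MiB."""
    return store_gib * MIB_PER_GIB / store_records


# Column-wise sums: workers, GiB, store records, prefetch records.
totals = tuple(sum(column) for column in zip(*workloads.values()))
averages = {name: avg_store_mib(gib, recs) for name, (_, gib, recs, _) in workloads.items()}

print(f"totals (workers, GiB, store, prefetch): {totals}")
for name, mib in averages.items():
    print(f"{name}: {mib:g} MiB per store record")
```

The averages come out as whole MiB values (4, 32, 16, 2, and 1 MiB in table order), which suggests fixed-size store records per workload.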
Latency
| Mode | write p50 ms | write p99 ms | read p50 ms | read p99 ms | all p99 ms |
|---|---|---|---|---|---|
| no_fdp | 0.083 | 2.732 | N/A | 0.319 | 2.672 |
| mixed | 0.084 | 3.337 | N/A | 0.308 | 3.234 |
| separated | 0.079 | 2.398 | N/A | 0.303 | 2.338 |
separated had the lowest write p99 and the lowest all p99 in this batch.
Log Observations
Replay completed cleanly. The warning categories that appeared in the earlier probe runs did not recur in this batch.
| Warning Category | Count | Files |
|---|---|---|
| Cleaning up in-flight store task | 0 | 0 |
| Failed batched memory allocation | 0 | 0 |
| Device metadata slot_bytes mismatch | 0 | 0 |
| FDP metadata payload rejected | 0 | 0 |
| io_uring fixed-buffer registration disabled | 0 | 0 |
| Tracebacks | 0 | 0 |
| ERROR logs | 0 | 0 |
| Exceptions | 0 | 0 |
Validity Against Scenario
| Criterion | Required by scenario | Observed | Status |
|---|---|---|---|
| Original goal: lifetime separation | separated RUH mapping separates hot/cold/metadata | hot/small/random data 0,1, cold/large/rag data 3,4, metadata 2/5/7 | PASS |
| Original goal: WAF available | host and media counters collected | WAF available in all 3 modes | PASS |
| Original goal: clean-state caveat addressed | R0 precondition status documented | 32GiB window-local R0 before each mode | PASS |
| README scale | stress | stress | PASS |
| README warmup | 2 | 2 | PASS |
| README measurement | 8 | 8 | PASS |
| Process failures | 0 | 0 | PASS |
| Replay record failures | 0 | 0 | PASS |
| Host write delta | > 0 | all modes > 0 | PASS |
| Media write delta | > 0 | all modes > 0 | PASS |
| p99 latency available | write p99 at minimum | write/all p99 available; read p99 parsed from read-like records | PASS |
Conclusion
This batch confirms that the PoC harness can run the README-style stress, warmup=2, measurement=8 workload under the original-document requirement of preconditioning before WAF evaluation. All three modes completed without replay failures, and formal WAF was available in all modes.
The main positive signal is that separated reduced host write volume by 26.9% versus no_fdp and 25.4% versus mixed, while also showing the lowest write p99. The WAF ratio itself was nearly identical across all modes at about 1.0159, so this run supports reduced write pressure and tail latency for separated, but not a strong WAF-ratio improvement claim. The next step is either repeated trials under the same 32GiB R0 condition or an expanded R0 span of 256GiB/512GiB to test whether the result is stable.