
FDP PoC Stage 4 Summary

Historical note: this file summarizes the earlier Stage 4 smoke/diagnostic runs (WARMUP_ITERATIONS=0, MEASUREMENT_ITERATIONS=1). The README-aligned formal scenario is now documented in runbook.md and ../reports/2026-06-15_poc_harness_evaluation_scenario.md. That scenario uses TRACE_SCALE=stress, WARMUP_ITERATIONS=2, MEASUREMENT_ITERATIONS=8, and the mode order no_fdp -> mixed -> separated, with report.md linking to each output's summary.md.

Current status

  • Target: /dev/nvme0n1 block namespace, /dev/ng0n1 raw generic namespace, controller /dev/nvme0, endgrp 1.
  • Stage 3 direct FDP sanity passed: write_uring ... dspec=0, readback matched.
  • Stage 4 3-mode stress comparison is now valid under one common setup.
  • Output root: /home/ny/fdp_measure_3mode_final
  • The parsed summary was regenerated with parse_results.py because the first redirected results_summary.csv was empty.

Final 3-mode result

Parsed CSV:

run_dir,mode,host_write_bytes_delta,fdp_hbmw_delta,fdp_mbmw_delta,fdp_mbe_delta,write_p99_ms,all_p99_ms
R_mixed_20260615T084519,mixed,43859968000,43860094976,44577980416,39862665216,1.2903,1.2211
R_no_fdp_20260615T083105,no_fdp,385622528000,385622347776,399353851904,418557984768,1.3531,1.3495
R_separated_20260615T083813,separated,198493184000,198492884992,202153451520,199313326080,3.3568,3.2733
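
To make the ordering in this CSV easier to see, the following is a minimal sketch that ranks the modes by host_write_bytes_delta and expresses each as a multiple of the smallest value. The function name `rank_by_host_writes` and the embedded `PARSED_CSV` constant are illustrative assumptions and are not part of parse_results.py.

```python
import csv
import io

# Rows copied verbatim from the parsed CSV above.
PARSED_CSV = """\
run_dir,mode,host_write_bytes_delta,fdp_hbmw_delta,fdp_mbmw_delta,fdp_mbe_delta,write_p99_ms,all_p99_ms
R_mixed_20260615T084519,mixed,43859968000,43860094976,44577980416,39862665216,1.2903,1.2211
R_no_fdp_20260615T083105,no_fdp,385622528000,385622347776,399353851904,418557984768,1.3531,1.3495
R_separated_20260615T083813,separated,198493184000,198492884992,202153451520,199313326080,3.3568,3.2733
"""


def rank_by_host_writes(csv_text):
    """Return (mode, bytes, multiple_of_smallest), largest host-write delta first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    host = {row["mode"]: int(row["host_write_bytes_delta"]) for row in rows}
    smallest = min(host.values())
    ranked = sorted(host.items(), key=lambda item: item[1], reverse=True)
    return [(mode, value, value / smallest) for mode, value in ranked]


for mode, value, multiple in rank_by_host_writes(PARSED_CSV):
    print(f"{mode:10s} {value / 1e9:8.1f} GB  {multiple:5.2f}x")
```

The multiples only describe this single run; they say nothing yet about run-to-run variance.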

Per-mode summary:

| Mode | Run dir | Workers | Records | Failures | Host write delta (bytes) | FDP HBMW delta (bytes) | Write p99 (ms) | All p99 (ms) | WAF status |
|---|---|---|---|---|---|---|---|---|---|
| no_fdp | R_no_fdp_20260615T083105 | 9 | 27,240 | 0 | 385,622,528,000 | 385,622,347,776 | 1.353 | 1.349 | unavailable: media/NAND write counter missing |
| separated | R_separated_20260615T083813 | 9 | 27,240 | 0 | 198,493,184,000 | 198,492,884,992 | 3.357 | 3.273 | unavailable: media/NAND write counter missing |
| mixed | R_mixed_20260615T084519 | 9 | 27,240 | 0 | 43,859,968,000 | 43,860,094,976 | 1.290 | 1.221 | unavailable: media/NAND write counter missing |

Interpretation:

  • This run is the first valid same-condition 3-mode comparison in this PoC stage.
  • separated is no longer stuck at host_write_bytes_delta=0; the earlier 0 result was caused by mode-specific raw-device window shift.
  • In this workload and measurement setup, host writes rank as no_fdp > separated > mixed.
  • The original hypothesis "separated should be lowest" is not supported by this run.
  • WAF itself is still unavailable because there is no vendor media/NAND write counter.
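
Although WAF is reported as unavailable, the parsed CSV does carry fdp_mbmw_delta alongside fdp_hbmw_delta. The sketch below computes their ratio per mode as a rough FDP-level indicator. Treating this ratio as a stand-in for WAF is an assumption made here for illustration, not something the harness does, and it presumes both columns come from the same FDP statistics snapshot pair. The function name `fdp_media_to_host_ratio` is hypothetical.

```python
# (fdp_hbmw_delta, fdp_mbmw_delta) per mode, copied from the parsed CSV.
FDP_DELTAS = {
    "no_fdp": (385_622_347_776, 399_353_851_904),
    "separated": (198_492_884_992, 202_153_451_520),
    "mixed": (43_860_094_976, 44_577_980_416),
}


def fdp_media_to_host_ratio(hbmw_delta, mbmw_delta):
    """Media bytes written per host byte written, from FDP statistics deltas."""
    if hbmw_delta <= 0:
        raise ValueError("host byte delta must be positive")
    return mbmw_delta / hbmw_delta


for mode, (hbmw, mbmw) in FDP_DELTAS.items():
    print(f"{mode:10s} MBMW/HBMW = {fdp_media_to_host_ratio(hbmw, mbmw):.4f}")
```

Whether this ratio is an acceptable proxy depends on how the device accounts for media writes, so it should be read next to the "unavailable" WAF status rather than in place of it.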

Counter consistency

Internal measurement and outer SMART/FDP snapshots agree.

| Mode | Internal host delta (bytes) | Outer SMART delta (bytes) | Outer FDP HBMW delta (bytes) |
|---|---|---|---|
| no_fdp | 385,622,528,000 | 385,622,528,000 | 385,622,347,776 |
| separated | 198,493,184,000 | 198,493,184,000 | 198,492,884,992 |
| mixed | 43,859,968,000 | 43,859,968,000 | 43,860,094,976 |

This means:

  • harness SMART parsing is working,
  • after-measurement settle timing is sufficient,
  • and the current result set is usable for comparison.
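
The agreement claimed above can be checked mechanically. The following is a minimal sketch that compares the three counters per mode using a relative gap. The tolerance of 1e-5 is an arbitrary illustrative threshold, not a value taken from the harness, and the helper names are hypothetical.

```python
# (internal host delta, outer SMART delta, outer FDP HBMW delta) in bytes.
COUNTERS = {
    "no_fdp": (385_622_528_000, 385_622_528_000, 385_622_347_776),
    "separated": (198_493_184_000, 198_493_184_000, 198_492_884_992),
    "mixed": (43_859_968_000, 43_859_968_000, 43_860_094_976),
}


def relative_gap(a, b):
    """Absolute difference as a fraction of the larger value."""
    return abs(a - b) / max(a, b)


def check_consistency(counters, tolerance=1e-5):
    """Return {mode: worst relative gap}; raise if any gap exceeds tolerance."""
    worst = {}
    for mode, (internal, smart, hbmw) in counters.items():
        gap = max(relative_gap(internal, smart), relative_gap(internal, hbmw))
        if gap > tolerance:
            raise ValueError(f"{mode}: counters disagree by {gap:.2e}")
        worst[mode] = gap
    return worst


for mode, gap in check_consistency(COUNTERS).items():
    print(f"{mode:10s} worst relative gap = {gap:.2e}")
```

In this data set the internal and SMART deltas match exactly, so the only nonzero gap is between the host delta and FDP HBMW.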

Next step

This is still a single valid comparison set. The next required step is repeated trials under the same condition to measure run-to-run variance.
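
Once repeated trials exist, run-to-run variance for a metric such as host_write_bytes_delta could be summarized as below. This is a sketch only: the function name `run_to_run_spread` is hypothetical, and the input list in the usage lines is a made-up placeholder, not measured data.

```python
import statistics


def run_to_run_spread(values):
    """Return (mean, sample stdev, coefficient of variation) for one mode."""
    if len(values) < 2:
        raise ValueError("need at least two trials to estimate variance")
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation (n - 1)
    return mean, stdev, stdev / mean


# Placeholder numbers for illustration only; replace with per-trial deltas.
placeholder_trials = [100.0, 110.0, 90.0]
mean, stdev, cv = run_to_run_spread(placeholder_trials)
print(f"mean={mean:.1f} stdev={stdev:.1f} cv={cv:.2%}")
```

A coefficient of variation per mode gives a quick sense of whether the gaps between modes are larger than the noise within a mode.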

Validated mode mapping

| Mode | FDP | RUH policy |
|---|---|---|
| no_fdp | off | no FDP placement hints |
| separated | on | hot/small/random: data [0,1], metadata [2]; cold/rag/large: data [3,4], metadata [5]; metadata-heavy: data [6], metadata [7] |
| mixed | on | all workers share data [0,1,3,4,6,7], metadata [2,5] |
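
The mapping above can be written down as data and sanity-checked. The sketch below encodes the RUH policy as a dictionary and verifies that no handle is assigned twice within a mode and that every index fits within FDP_RUH_COUNT=8. The dictionary layout and the group keys such as `hot_small_random` are illustrative assumptions, not the harness's actual configuration format.

```python
RUH_COUNT = 8  # matches FDP_RUH_COUNT=8 used in the commands in this file

RUH_POLICY = {
    "no_fdp": {},  # FDP off: no placement hints
    "separated": {
        "hot_small_random": {"data": [0, 1], "metadata": [2]},
        "cold_rag_large": {"data": [3, 4], "metadata": [5]},
        "metadata_heavy": {"data": [6], "metadata": [7]},
    },
    "mixed": {
        "all_workers": {"data": [0, 1, 3, 4, 6, 7], "metadata": [2, 5]},
    },
}


def validate_policy(policy, ruh_count=RUH_COUNT):
    """Return the sorted handles a mode uses; raise on duplicates or bad indices."""
    used = []
    for group in policy.values():
        used.extend(group["data"])
        used.extend(group["metadata"])
    if len(used) != len(set(used)):
        raise ValueError("a reclaim unit handle is assigned more than once")
    out_of_range = [h for h in used if not 0 <= h < ruh_count]
    if out_of_range:
        raise ValueError(f"handles outside 0..{ruh_count - 1}: {out_of_range}")
    return sorted(used)


for mode, policy in RUH_POLICY.items():
    print(mode, validate_policy(policy))
```

Both FDP modes use all eight handles; they differ in how handles are grouped, and handle 7 carries metadata in separated but data in mixed.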

Next command: actual smoke without R0

This does not run blkdiscard, but it still writes to the raw NVMe namespace. Run it only on the confirmed FDP test SSD.

cd /home/ny/work/fdp/04_measure

for mode in no_fdp mixed separated; do
  sudo env \
    HOME=/home/ny \
    REPO=/home/ny/work/LMCache-fdp-poc-src \
    NVME_DEV=/dev/nvme0 \
    NVME_NS=/dev/nvme0n1 \
    NVME_IO_DEV=/dev/ng0n1 \
    FDP_ENDGRP_ID=1 \
    FDP_RUH_COUNT=8 \
    TRACE_SCALE=smoke \
    WARMUP_ITERATIONS=0 \
    MEASUREMENT_ITERATIONS=1 \
    OUT=/home/ny/fdp_measure_retry \
    bash run_poc_harness.sh "$mode"
done

/usr/bin/python3 parse_results.py /home/ny/fdp_measure_retry > /home/ny/fdp_measure_retry/results_summary.csv
/usr/bin/python3 write_report.py /home/ny/fdp_measure_retry > /home/ny/fdp_measure_retry/report.md
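
Because an earlier redirected results_summary.csv came out empty (see Current status), it may be worth checking the file before trusting the report. The following is a minimal sketch of such a check. The function name `summary_problems` and the set of required columns are assumptions chosen to match the CSV header shown in this file, and this check is not part of the harness.

```python
import csv
import io

# Columns this check insists on; taken from the parsed CSV header in this file.
REQUIRED_COLUMNS = {"run_dir", "mode", "host_write_bytes_delta"}


def summary_problems(csv_text, expected_modes=("no_fdp", "mixed", "separated")):
    """Return a list of human-readable problems; an empty list means it looks usable."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    missing = REQUIRED_COLUMNS - header
    if missing:
        return [f"missing columns: {sorted(missing)}"]
    seen_modes = {row["mode"] for row in reader}
    return [f"no row for mode {mode}" for mode in expected_modes
            if mode not in seen_modes]


# An empty file, like the first redirected summary, is flagged immediately.
print(summary_problems(""))
```

In practice the text would be read from /home/ny/fdp_measure_retry/results_summary.csv after the parse step.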

Optional command: R0-normalized comparison

This runs blkdiscard /dev/nvme0n1 before each mode and is destructive.

cd /home/ny/work/fdp/04_measure

for mode in no_fdp mixed separated; do
  sudo env \
    HOME=/home/ny \
    NVME_DEV=/dev/nvme0 \
    NVME_NS=/dev/nvme0n1 \
    FDP_ENDGRP_ID=1 \
    OUT=/home/ny/fdp_measure_retry \
    PRECONDITION_SIZE_GB=16 \
    bash run_R0_precondition.sh

  sudo env \
    HOME=/home/ny \
    REPO=/home/ny/work/LMCache-fdp-poc-src \
    NVME_DEV=/dev/nvme0 \
    NVME_NS=/dev/nvme0n1 \
    NVME_IO_DEV=/dev/ng0n1 \
    FDP_ENDGRP_ID=1 \
    FDP_RUH_COUNT=8 \
    TRACE_SCALE=smoke \
    WARMUP_ITERATIONS=0 \
    MEASUREMENT_ITERATIONS=1 \
    OUT=/home/ny/fdp_measure_retry \
    bash run_poc_harness.sh "$mode"
done

/usr/bin/python3 parse_results.py /home/ny/fdp_measure_retry > /home/ny/fdp_measure_retry/results_summary.csv
/usr/bin/python3 write_report.py /home/ny/fdp_measure_retry > /home/ny/fdp_measure_retry/report.md

Interpretation

The current runs prove more than harness execution: they provide a valid same-condition 3-mode comparison. The next work item is not rerunning the same baseline, but explaining why mixed produced the lowest host-write delta and whether that reflects trace shape, RUH policy, or an artifact of this PoC harness.