GWP-177 Design Pack — kn86fw bootfs/rootfs producer extension
Story narrative
Today, two artifacts come out of the system-image pipeline:
- `tools/sd-provision/build.sh` invokes pi-gen and copies a single `.img` (the full six-partition A/B SD layout per ADR-0011) to `tools/sd-provision/build/kn86-os-vDEV.img`. That `.img` is what an operator flashes onto a fresh microSD with rpi-imager (or what the desktop flasher will write whole during initial provisioning).
- `tools/kn86fw/` wraps an opaque payload file with the 128-byte `.kn86fw` header (magic, format version, semver fields, payload SHA-256, min-bootloader-version). This is the on-the-wire update package consumed by the desktop flasher and the Pi-side verifier. Phase 0 explicitly treats `--input` as opaque bytes.
What is missing — and what update-system.md and ADR-0011 already commit to — is the slot artifact: the bootfs partition image and the rootfs partition image, emitted as separate, verifiable files so they can (a) be wrapped by kn86fw build into a .kn86fw for the field-update flow, (b) be flashed to the inactive A/B slot by kn86flash (Wave 2 Tauri) without disturbing the user’s /home/shared (p6) or the active slot, and (c) feed the cartridge-MSC SD-card pipeline (where the device exposes the inactive slot to the host as USB-MSC, and the host writes a per-partition image straight in).
The current kn86fw README’s “Phase 0 scope” callout names this work explicitly: “A future Wave 2 tool will assemble bootfs + rootfs slot images from a Pi OS Lite build and hand the resulting blob to kn86fw build for packaging.” GWP-177 is that work. This design pack picks the producer’s shape, locks the file formats, and inventories the surface area changes inside tools/kn86fw/ — without writing the producer.
The decision to defer implementation past v0.1 is correct: pi-gen is producing the full .img, the .kn86fw wrapper is shipped, the SD layout is canonical in ADR-0011, and the flasher’s whole-image path covers Stage 0 bring-up. Per-partition production is what makes field updates affordable (you don’t reflash the full SD; you write 256 MB bootfs + ~2 GB rootfs to the inactive slot only). That problem only becomes urgent after v0.1 ships and the device starts accumulating user state on p6 that re-imaging would destroy.
Producer shape (the load-bearing decision)
There are three plausible shapes; this design pack picks option C with option B as the immediate v0.1 fallback:
| Option | What | Pros | Cons |
|---|---|---|---|
| A. Full producer (kn86fw replaces pi-gen) | kn86fw produce-bootfs and kn86fw produce-rootfs build the partition contents from a stage manifest in pure Rust — equivalent to a from-scratch debootstrap + Pi firmware bundling. | Single-tool story; no shell + Docker dependency. | Re-implements pi-gen. ~weeks of work + maintenance burden. The CLAUDE.md “Platform Engineering” surface explicitly leans on pi-gen for vendor-firmware bundling and tryboot support. Reject. |
| B. Wrapper (kn86fw build calls pi-gen under the hood) | kn86fw build-image shells out to tools/sd-provision/build.sh, then post-processes. | Convenient for callers; one CLI surface. | Couples a Rust CLI to a bash + Docker pipeline; CI ergonomics are worse, not better; obscures the decoupled nature of the two steps. Reject as primary. |
| C. Post-processor (kn86fw splits the existing .img) | kn86fw split <image.img> reads the GPT/MBR partition table from a pi-gen .img, extracts the bootfs (p2) and rootfs (p4) partition byte ranges, and writes them as separate artifacts (raw partition image + manifest). Adds kn86fw produce-update-bundle that wraps a (bootfs, rootfs) pair into the existing .kn86fw payload format. | Clean separation: pi-gen owns image construction, kn86fw owns image consumption and packaging. Reuses the existing crate’s strengths (header writing, SHA-256, deterministic output). Backwards-compatible: tools/sd-provision/build.sh keeps working unchanged; the new subcommands are additive. | Requires a tiny GPT/MBR parser in Rust (one dependency: gpt = "3", or hand-rolled; the layout is fixed by ADR-0011, so a hand-rolled fixed-offset reader is also viable). |
Decision: option C. Producer shape is post-processor, not a wrapper and not a re-implementation. The .img from tools/sd-provision/build.sh is the input; per-partition artifacts and the .kn86fw update bundle are the outputs. This preserves Spec Hygiene Rule 1 (single source of truth for the partition layout — ADR-0011 — is consumed by both the producer of the .img and the splitter).
The wrapper convenience (option B) lands as a thin bash one-liner in tools/sd-provision/build.sh that invokes the new kn86fw split after the .img copy step — not as Rust code calling out to bash. That keeps the dependency direction clean.
Subcommand surface (concrete CLI shape)
Three new subcommands extend the existing build / inspect / verify set:
```sh
# 1. Split an existing .img into per-partition raw artifacts + a manifest.
kn86fw split \
  --input build/kn86-os-vDEV.img \
  --output build/slot-artifacts/ \
  [--partitions bootfs,rootfs]   # default: bootfs+rootfs only; never p1/p6

# Output layout (deterministic file names, sorted manifest):
# build/slot-artifacts/
# ├── bootfs.img      (raw 256 MB FAT32 image, exact p2 byte range)
# ├── bootfs.sha256
# ├── rootfs.img      (raw ext4 image, exact p4 byte range)
# ├── rootfs.sha256
# └── manifest.toml   (source .img path, source SHA-256, partition table snapshot,
#                      build timestamp, kn86fw version, schema_version)

# 2. Produce a .kn86fw update bundle from a (bootfs, rootfs) pair.
kn86fw produce-update-bundle \
  --bootfs build/slot-artifacts/bootfs.img \
  --rootfs build/slot-artifacts/rootfs.img \
  --version 0.2.0 \
  --nosh-version 0.2.0 \
  --output build/kn86-v0.2.0.kn86fw

# 3. Inspect a slot-artifacts directory (parity with `kn86fw inspect` for .kn86fw files).
kn86fw inspect-slot build/slot-artifacts/
# Prints partition sizes, SHA-256s, source .img origin, manifest schema version.
```

`build` keeps its existing single-file payload contract verbatim (no breaking change to the Phase 0 surface). `produce-update-bundle` is a higher-level convenience that internally concatenates `bootfs.img` + `rootfs.img` deterministically (header + length-prefixed sections) and calls the existing `cmd_build::run`; it does not bypass the existing header writer. This means every `.kn86fw` produced by either path passes the existing verify flow unchanged.
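The core of `kn86fw split` is a bounded copy of one partition's byte range out of the source image. The following is a minimal sketch of that step, not the planned implementation: the function name `extract_partition`, its signature, and the 512-byte sector size are assumptions made for illustration.

```rust
use std::io::{self, Read, Seek, SeekFrom, Write};

/// Assumed sector size for LBA-to-byte conversion (hypothetical constant).
const SECTOR_SIZE: u64 = 512;

/// Copy `length_lba` sectors starting at `start_lba` from `src` into `dst`.
/// Returns the number of bytes copied, or an error if the source ends early.
/// (A real implementation would also guard the multiplications against overflow.)
fn extract_partition<R: Read + Seek, W: Write>(
    src: &mut R,
    dst: &mut W,
    start_lba: u64,
    length_lba: u64,
) -> io::Result<u64> {
    let want = length_lba * SECTOR_SIZE;
    src.seek(SeekFrom::Start(start_lba * SECTOR_SIZE))?;
    // `take` caps the reader so the copy stops at the end of the partition.
    let copied = io::copy(&mut src.by_ref().take(want), dst)?;
    if copied != want {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            format!("partition truncated: wanted {want} bytes, got {copied}"),
        ));
    }
    Ok(copied)
}

fn main() -> io::Result<()> {
    // Toy 4-sector "image": sector n is filled with the byte value n.
    let image: Vec<u8> = (0u8..4).flat_map(|n| vec![n; SECTOR_SIZE as usize]).collect();
    let mut src = io::Cursor::new(image);
    let mut out: Vec<u8> = Vec::new();
    let n = extract_partition(&mut src, &mut out, 1, 2)?;
    println!("copied {n} bytes");
    Ok(())
}
```

Taking `Read + Seek` rather than a file path keeps the function testable against an in-memory `Cursor`, which matches the fixture-driven test strategy described later in this pack.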
Update-bundle inner format (the payload that produce-update-bundle hands to cmd_build::run)
```
offset  size                 field
------  -------------------  -----
0x00    8                    bundle_magic[8]  = "KN86SLOT"
0x08    2                    bundle_version   = uint16_t LE = 1
0x0A    2                    _reserved        = 0x0000
0x0C    4                    section_count    = uint32_t LE = 2 (bootfs, rootfs)
0x10    56 × section_count   section_table[]  = { kind: u32, _pad: u32, offset: u64, length: u64, sha256: [u8; 32] }
```

Section table entries are 56 bytes each (u32 kind + 4 pad + u64 offset + u64 length + [u8; 32] sha256), so the inner header is 16 + 56 × section_count bytes (128 bytes for the two-section case). Section payload bytes follow contiguously, padded to 4 KiB alignment per section. The exact byte layout for this inner format is an open question (see “Inner-format wire layout” under the open questions below): the design pack commits to having an inner format with a magic, a section table, and per-section SHA-256s, but the precise field widths are decided when the producer is implemented (alongside the matching C header at tools/kn86fw/format/kn86slot.h, mirroring the kn86fw.h discipline).
The outer .kn86fw header is unchanged. payload_sha256 covers the entire KN86SLOT blob (header + sections), exactly as for any other payload.
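To make the recommended (not yet locked) inner layout concrete, here is a rough sketch of an encoder for the 16-byte header plus 56-byte section entries. Everything named here is an assumption for illustration: the `Section` struct, the `kind` numbering (1 = bootfs, 2 = rootfs), `encode_inner_header`, and the reading that each section payload starts on a 4 KiB boundary.

```rust
/// One section-table entry under the recommended (unlocked) layout.
struct Section {
    kind: u32,        // hypothetical numbering: 1 = bootfs, 2 = rootfs
    offset: u64,      // byte offset of the payload inside the KN86SLOT blob
    length: u64,      // payload length in bytes, before padding
    sha256: [u8; 32], // digest of the payload bytes
}

const HEADER_LEN: usize = 16;
const ENTRY_LEN: usize = 56;

/// Serialize the fixed header and the section table, all little-endian.
fn encode_inner_header(sections: &[Section]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN + ENTRY_LEN * sections.len());
    out.extend_from_slice(b"KN86SLOT"); // bundle_magic
    out.extend_from_slice(&1u16.to_le_bytes()); // bundle_version
    out.extend_from_slice(&0u16.to_le_bytes()); // _reserved
    out.extend_from_slice(&(sections.len() as u32).to_le_bytes()); // section_count
    for s in sections {
        out.extend_from_slice(&s.kind.to_le_bytes());
        out.extend_from_slice(&[0u8; 4]); // explicit padding, always zero
        out.extend_from_slice(&s.offset.to_le_bytes());
        out.extend_from_slice(&s.length.to_le_bytes());
        out.extend_from_slice(&s.sha256);
    }
    out
}

/// Round `n` up to the next 4 KiB boundary.
fn align_4k(n: u64) -> u64 {
    (n + 4095) & !4095
}

fn main() {
    // Two sections in the fixed order: bootfs first, rootfs second (digests zeroed here).
    let bootfs_len: u64 = 256 * 1024 * 1024;
    let header_len = (HEADER_LEN + 2 * ENTRY_LEN) as u64;
    let bootfs_off = align_4k(header_len);
    let rootfs_off = align_4k(bootfs_off + bootfs_len);
    let sections = [
        Section { kind: 1, offset: bootfs_off, length: bootfs_len, sha256: [0; 32] },
        Section { kind: 2, offset: rootfs_off, length: 2 * 1024 * 1024 * 1024, sha256: [0; 32] },
    ];
    let header = encode_inner_header(&sections);
    println!("inner header: {} bytes, rootfs at offset {}", header.len(), rootfs_off);
}
```

Writing every field with explicit `to_le_bytes` calls, including the padding, avoids depending on Rust struct layout and keeps the output byte-identical across platforms, which is what the C-side `_Static_assert` parity check would compare against.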
Output formats
| Surface | Format | Why |
|---|---|---|
| bootfs.img | Raw FAT32 partition image (exact p2 byte range from the source .img). | Mountable on a host with mount -o loop. The Pi-side flash path is dd if=bootfs.img of=/dev/mmcblk0p2 bs=4M conv=fsync, with no additional unwrap step. Matches what the kexec’d updater does today (per ADR-0011). |
| rootfs.img | Raw ext4 partition image. | Same rationale: directly dd-able to the inactive rootfs slot. |
| *.sha256 | One-line sha256sum-format file (<hex> <basename>). | Trivially verifiable with stock sha256sum -c; no new tooling on the host side. |
| manifest.toml | TOML, sorted keys, no comments. | Human-inspectable, machine-parseable, Cargo-native. Carries the schema_version so future format bumps are detectable. |
| .kn86fw update bundle | Existing 128-byte header + KN86SLOT inner blob. | Reuses the entire existing parse / verify / inspect surface; no second wire format. |
Tarballs / ext4 dumps / OTA-specific bundle formats are explicitly NOT chosen. Tarballs require a tar parser on the device side (we don’t have one in the kexec’d updater); ext4 dumps (e2image) have nondeterministic mtimes and aren’t byte-identical run-to-run; OTA bundle formats (Mender, RAUC) are deferred to production per ADR-0011 §Risks #7. The raw-partition + SHA-256 + TOML manifest combination is the lowest-complexity surface that satisfies every consumer.
Use cases (what the producer enables)
- A/B field update via `kn86flash`. Operator plugs in cable, the Tauri flasher fetches a `.kn86fw` from a release URL, calls `kn86fw verify` (already implemented), then `kn86flash` writes the unwrapped `bootfs.img` to the inactive bootfs partition and `rootfs.img` to the inactive rootfs partition via the elevated helper. `/home/shared` (p6) is never touched. This is the load-bearing field flow ADR-0011 commits to.
- Dev iteration: rebuild rootfs only. A change touching `stage-kn86-runtime` (a systemd unit, a nOSh binary refresh) only invalidates the rootfs partition. `kn86fw split` produces a fresh `rootfs.img`; the developer flashes only that to slot B with `dd`, reboots into B with `tryboot`, validates, and either commits or reverts via `tryboot` rollback. Bootfs partition stays untouched. Iteration cost: ~30 s flash vs. ~5 min full-image flash.
- CI artifact splitting. The release CI (`.github/workflows/system-image-build.yml`) runs `tools/sd-provision/build.sh`, then `kn86fw split`, then `kn86fw produce-update-bundle`, and uploads three release assets per tag: the full `.img`, the `bootfs.img` + `rootfs.img` + `manifest.toml` triplet, and the `.kn86fw`. This matches the existing CI contract from `system-image-build.md` (“uploads the `.img` and `.kn86fw` as release assets”) and adds the slot-artifact triplet alongside.
- Cartridge-MSC SD-card pipeline. The cartridge-MSC bridge (ADR-0019) doesn’t directly consume slot artifacts, but the same producer is the natural source for the bootstrapping pipeline that prepares a blank cartridge SD via the same `dd` flow. Out of scope for this pack; flagged for cross-reference.
Reproducibility
The producer must emit byte-identical output for byte-identical input. Concretely:
- `bootfs.img` and `rootfs.img` are byte slices of the source `.img`. The bytes are deterministic by construction (slice of an immutable input). The only nondeterminism risk is the source `.img` itself, which `system-image-build.md` notes is reproducible modulo a `/etc/kn86-build-id` timestamp; that’s out of scope here.
- `*.sha256` is `sha2`-computed over the partition bytes, so it is deterministic.
- `manifest.toml` must use sorted keys, ASCII-only values, no inline comments, and either zero timestamps or a `SOURCE_DATE_EPOCH`-controlled timestamp (per the existing reproducible-builds convention). Recommend: emit a `built_at` field only when `SOURCE_DATE_EPOCH` is set; otherwise omit it.
- The `.kn86fw` update bundle is deterministic if the inner `KN86SLOT` blob is deterministic. The inner blob’s section ordering is fixed (`bootfs` first, `rootfs` second); the section padding is zero-fill. No source-date-dependent fields go into the inner header.
- Verification path: a CI step does two builds and asserts byte-identical output of every artifact. This is the primary reproducibility gate; it fails loud if any future code change introduces nondeterminism.
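The manifest rules above (sorted keys, `built_at` only under `SOURCE_DATE_EPOCH`) can be sketched with nothing but the standard library. This is an illustration under assumptions, not the planned writer: the key names `source_sha256` and `kn86fw_version`, the function `render_manifest`, and the choice of an integer `built_at` are all hypothetical, and the values are assumed to be plain ASCII with no characters that need TOML escaping.

```rust
use std::collections::BTreeMap;

/// Render a flat manifest as TOML with keys in sorted order.
/// `epoch` is the parsed SOURCE_DATE_EPOCH, if the variable was set.
fn render_manifest(source_sha256: &str, kn86fw_version: &str, epoch: Option<u64>) -> String {
    // BTreeMap iterates in key order, so output never depends on insertion order.
    let mut fields: BTreeMap<&str, String> = BTreeMap::new();
    fields.insert("schema_version", "1".to_string());
    // Values are assumed to be hex digests / semver strings: no quotes or backslashes.
    fields.insert("source_sha256", format!("\"{source_sha256}\""));
    fields.insert("kn86fw_version", format!("\"{kn86fw_version}\""));
    // Only emit a timestamp when the caller supplied a controlled one.
    if let Some(secs) = epoch {
        fields.insert("built_at", secs.to_string());
    }
    fields.iter().map(|(k, v)| format!("{k} = {v}\n")).collect()
}

fn main() {
    // Read the reproducible-builds variable; an unset or unparsable value means "omit built_at".
    let epoch = std::env::var("SOURCE_DATE_EPOCH")
        .ok()
        .and_then(|s| s.parse().ok());
    print!("{}", render_manifest("<hex digest>", "0.2.0", epoch));
}
```

Because wall-clock time is never consulted, two runs with the same inputs and the same environment produce the same bytes, which is the property the double-build CI gate checks.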
Backwards compatibility
The current tools/sd-provision/build.sh flow is unchanged. After the producer ships, callers that only want the .img keep getting the .img. The new subcommands are additive:
- `tools/sd-provision/build.sh` keeps copying the `.img` to `OUTPUT_IMG` exactly as today; an optional post-step (gated by `KN86_PRODUCE_SLOT_ARTIFACTS=1`) invokes `kn86fw split` and `kn86fw produce-update-bundle`.
- `kn86fw build` (Phase 0 single-file payload contract) is unchanged. Its tests stay green.
- The `.kn86fw` outer header (`format/kn86fw.h`) is unchanged. No format version bump.
- Adding a `KN86SLOT` inner format adds a new C header (`tools/kn86fw/format/kn86slot.h`), with the same single-source-of-truth discipline as `kn86fw.h`.
The only file that changes shape is the README: a new “Wave 2 — slot artifacts” section is added explaining the new subcommands, and the existing “Phase 0 scope” callout is updated to reflect that the slot-producer half has shipped.
Cargo crate structure
Existing tools/kn86fw/src/ modules:

```
src/
├── main.rs        (clap dispatch; +3 subcommands: split, produce-update-bundle, inspect-slot)
├── lib.rs         (re-exports; +pub mod split, slot)
├── header.rs      (UNCHANGED: outer .kn86fw header)
├── cmd_build.rs   (UNCHANGED: Phase 0 single-file payload path)
├── inspect.rs     (UNCHANGED: extends only via inspect_slot.rs)
└── verify.rs      (UNCHANGED: payload SHA-256 verification)
```

New modules:

```
src/
├── partition_table.rs     (NEW: minimal MBR/GPT partition-table reader; takes a Read+Seek,
│                           returns a Vec<Partition { number, start_lba, length_lba, type_code }>.
│                           Hand-rolled, no dep; ADR-0011 layout is fixed.)
├── split.rs               (NEW: orchestrates partition-table read → byte slice → write to disk +
│                           SHA-256 + manifest. Uses partition_table.rs.)
├── slot.rs                (NEW: KN86SLOT inner-format encoder/decoder. Mirrored in
│                           format/kn86slot.h with C-side struct + _Static_assert parity tests.)
├── cmd_split.rs           (NEW: `kn86fw split` subcommand impl; depends on split.rs.)
├── cmd_produce_bundle.rs  (NEW: `kn86fw produce-update-bundle` impl; depends on slot.rs +
│                           cmd_build::run for the outer-header step.)
└── inspect_slot.rs        (NEW: `kn86fw inspect-slot` impl; pretty-prints manifest + partition stats.)
```

New format/ files:

```
format/
├── kn86fw.h    (UNCHANGED)
└── kn86slot.h  (NEW: KN86SLOT inner-format C header. Same packed-struct + _Static_assert
                 discipline as kn86fw.h. Sized to exactly N bytes per the locked spec.)
```

New tests/:

```
tests/
├── integration.rs         (UNCHANGED: Phase 0 tests stay green)
├── integration_split.rs   (NEW: split a known-good fixture .img, assert per-partition SHA-256,
│                           assert manifest.toml stable bytes.)
├── integration_bundle.rs  (NEW: produce + verify roundtrip; assert outer .kn86fw verify passes;
│                           assert inner KN86SLOT round-trips through slot.rs.)
└── integration_repro.rs   (NEW: build twice from the same fixture, diff every byte.)
```

New dependencies: none required if partition_table.rs is hand-rolled. (If we adopt the gpt crate, that’s one new dep; the hand-rolled alternative is preferred for vendoring discipline and to keep the dependency surface small.)
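As a feel for how small a hand-rolled reader can be, here is a sketch of parsing the four primary entries of a classic MBR sector. It is illustrative only: whether the pi-gen image uses MBR, MBR with logical partitions, or GPT is not settled by this pack, the function name `parse_mbr_primaries` is hypothetical, and a six-partition layout would need extended-partition or GPT handling that this sketch omits. The `Partition` fields follow the shape named in the module list, with `type_code` narrowed to the one-byte MBR type.

```rust
#[derive(Debug, PartialEq)]
struct Partition {
    number: u8,
    start_lba: u64,
    length_lba: u64,
    type_code: u8, // MBR partition type byte (a GPT reader would carry a type GUID instead)
}

/// Parse the four primary partition entries from sector 0 of an MBR disk image.
fn parse_mbr_primaries(sector0: &[u8; 512]) -> Result<Vec<Partition>, String> {
    // A valid MBR ends with the boot signature bytes 0x55, 0xAA.
    if sector0[510..512] != [0x55u8, 0xAA] {
        return Err("missing 0x55AA boot signature".to_string());
    }
    let mut parts = Vec::new();
    for i in 0..4usize {
        // The partition table starts at byte 446; each entry is 16 bytes.
        let base = 446 + 16 * i;
        let e = &sector0[base..base + 16];
        let type_code = e[4];
        if type_code == 0 {
            continue; // type 0x00 marks an unused slot
        }
        let start = u32::from_le_bytes([e[8], e[9], e[10], e[11]]);
        let len = u32::from_le_bytes([e[12], e[13], e[14], e[15]]);
        parts.push(Partition {
            number: i as u8 + 1,
            start_lba: u64::from(start),
            length_lba: u64::from(len),
            type_code,
        });
    }
    Ok(parts)
}

fn main() {
    // Hand-crafted fixture: one FAT32 (LBA) entry of 256 MiB starting at sector 8192.
    let mut sector = [0u8; 512];
    sector[510] = 0x55;
    sector[511] = 0xAA;
    sector[446 + 4] = 0x0C;
    sector[446 + 8..446 + 12].copy_from_slice(&8192u32.to_le_bytes());
    sector[446 + 12..446 + 16].copy_from_slice(&524_288u32.to_le_bytes());
    for p in parse_mbr_primaries(&sector).expect("valid fixture") {
        println!("p{}: type {:#04x}, start {}, sectors {}", p.number, p.type_code, p.start_lba, p.length_lba);
    }
}
```

The fixture built in `main` is the kind of tiny hand-crafted input the test strategy calls for: 512 bytes are enough to exercise the signature check and the entry decoding without a real image.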
Acceptance criteria (when the implementation actually runs, post-v0.1)
- `kn86fw split <kn86-os-vDEV.img>` emits the artifact triplet in the documented layout. Each `*.img` SHA-256 matches the corresponding `*.sha256` file.
- `kn86fw produce-update-bundle --bootfs ... --rootfs ... ...` emits a `.kn86fw` that passes the existing `kn86fw verify` flow without modification.
- `kn86fw inspect-slot <dir>` prints partition sizes, SHA-256s, source `.img` origin, and `schema_version`.
- Reproducibility test (`integration_repro.rs`) passes: two `split + produce-update-bundle` runs against the same input produce byte-identical outputs at every level.
- C/Rust parity test: `tools/kn86fw/format/kn86slot.h` compile-time assertions (`_Static_assert`) match the Rust struct layout, mirroring the existing `kn86fw.h` parity test in `tests/integration.rs`.
- `tools/sd-provision/build.sh` with `KN86_PRODUCE_SLOT_ARTIFACTS=1` set invokes the new producer and lands the artifacts under `tools/sd-provision/build/slot-artifacts/`. Existing default flow (no env var) unchanged.
- `docs/device/os/update-system.md` gains a “Slot artifacts” subsection naming the producer subcommands and the on-disk artifact layout. Cross-references back to this design pack.
- CI workflow (`.github/workflows/system-image-build.yml`) uploads the slot-artifact triplet alongside the existing `.img`. (Stage 0 bring-up already enables artifact upload per `system-image-build.md`; this is an additive matrix entry.)
Edge cases (≥3)
- Source `.img` is a partial / truncated build. A killed pi-gen leaves a partial `.img` in `deploy/`. `split` must read the partition table first and refuse to extract any partition that runs past the end of the input file, with a clear error citing the partition number, declared range, and actual file length. Don’t silently produce a truncated `bootfs.img`.
- Partition table doesn’t match ADR-0011’s six-partition layout. A user feeds `kn86fw split` an arbitrary `.img` (e.g., raw Raspberry Pi OS Lite they downloaded). The splitter must validate the table against the ADR-0011 expected shape (6 partitions, expected sizes ±10%, expected filesystem types) and fail loud if it doesn’t match, with a hint pointing at `system-image-build.md`. Refuse to extract unless the layout matches; don’t fall back to “extract whatever’s in p2 and p4,” because the wrong filesystem type in the wrong slot would brick a device.
- Partition images larger than the .kn86fw payload size we want to ship. A `rootfs.img` is ~2 GB; a full bundle is ~2.25 GB. The desktop flasher uploads this over USB in a few seconds, which is fine for v1, but the `produce-update-bundle` command should print the resulting bundle size and warn if it crosses (configurable) 3 GB. Compression is out of scope for v1 (the research brief notes the format is “gzipped ext4/FAT,” but ADR-0011’s actually-implemented `.kn86fw` carries raw bytes; revisit if bundle size becomes a transport problem).
- Concurrent split runs writing to the same output dir. The producer must either lock the output dir or refuse to overwrite an existing non-empty directory unless `--force` is passed. Avoid the `manifest.toml` from one run getting paired with `bootfs.img` from another. Standard Rust file-creation semantics + an explicit existence check before write.
- A `.kn86fw` produced by `produce-update-bundle` is fed to the Phase 0 path. The outer `verify` must succeed (the SHA-256 covers the whole `KN86SLOT` blob, which is opaque to the outer header). `inspect` must print the outer header normally; the user invokes `inspect-slot` (or eventually a smart auto-detect on `inspect`) to crack open the inner format. Confirm the existing `inspect` doesn’t try to parse beyond the header.
- Pi firmware files in p1 (the common boot region). The producer extracts only bootfs (p2) and rootfs (p4) of the active slot at split time; p1 (`autoboot.txt`, bootcode, common stage-1) is never part of an update bundle, because the field-update flow per ADR-0011 doesn’t touch p1: it only writes the inactive slot’s bootfs + rootfs and rewrites `autoboot.txt`’s `tryboot_a_b` line in place. Document this explicitly in the README to head off future “why isn’t p1 in the bundle” questions.
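The truncated-image case above reduces to a bounds check run before any bytes are copied. A minimal sketch follows, assuming 512-byte sectors; the function name `check_in_bounds` and the exact error wording are hypothetical, chosen only to show an error that cites the partition number, the declared range, and the actual file length.

```rust
/// Assumed sector size for LBA-to-byte conversion (hypothetical constant).
const SECTOR_SIZE: u64 = 512;

/// Refuse a partition whose declared byte range runs past the end of the image.
fn check_in_bounds(number: u8, start_lba: u64, length_lba: u64, file_len: u64) -> Result<(), String> {
    // Checked arithmetic: a corrupt table must not wrap around and appear valid.
    let start = start_lba.checked_mul(SECTOR_SIZE);
    let len = length_lba.checked_mul(SECTOR_SIZE);
    let end = match (start, len) {
        (Some(s), Some(l)) => s.checked_add(l),
        _ => None,
    };
    match (start, end) {
        (Some(_), Some(e)) if e <= file_len => Ok(()),
        (Some(s), Some(e)) => Err(format!(
            "partition {number} declares bytes {s}..{e} but the image is only {file_len} bytes long"
        )),
        _ => Err(format!("partition {number}: declared range overflows u64")),
    }
}

fn main() {
    // A 64 MiB file cannot contain a 256 MiB partition, so this reports an error.
    let file_len = 64 * 1024 * 1024;
    match check_in_bounds(2, 8192, 524_288, file_len) {
        Ok(()) => println!("partition 2 fits"),
        Err(e) => eprintln!("refusing to split: {e}"),
    }
}
```

Running the check for every requested partition before writing anything means a truncated input never leaves a partial `bootfs.img` on disk.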
Engineering hand-off notes
- Owner when implementation runs: Platform Engineering (Rust + system-image expertise). Single engineer, ~3–5 days end-to-end (split + bundle + tests + docs + CI step). Not a sprint-blocker once unblocked.
- Files this design pack expects to land:
  - New: `tools/kn86fw/src/{partition_table.rs, split.rs, slot.rs, cmd_split.rs, cmd_produce_bundle.rs, inspect_slot.rs}` and `tools/kn86fw/format/kn86slot.h`.
  - Edit: `tools/kn86fw/src/{main.rs, lib.rs}`, `tools/kn86fw/Cargo.toml` (deps if any), `tools/kn86fw/README.md` (Wave 2 section), `tools/sd-provision/build.sh` (optional post-step), `docs/device/os/update-system.md` (slot-artifacts subsection), `.github/workflows/system-image-build.yml` (artifact-upload matrix).
  - Untouched: `tools/kn86fw/src/{header.rs, cmd_build.rs, inspect.rs, verify.rs}` and `tools/kn86fw/format/kn86fw.h`.
- Test strategy: TDD. Start with `partition_table.rs` against a tiny hand-crafted MBR/GPT fixture; then `split.rs` against a fixture `.img` (a 64 MB toy with the ADR-0011 partition table and known-byte-pattern partitions); then `slot.rs` round-trip; then end-to-end via `integration_split.rs` + `integration_bundle.rs`; finally `integration_repro.rs`.
- Spec hygiene reminders for the implementation PR:
  - Spec Hygiene Rule 1: do not restate the ADR-0011 partition layout in `partition_table.rs` comments or the README. Reference the ADR.
  - Spec Hygiene Rule 3: when `KN86SLOT` lands, search the repo for stale “monolithic .img” references in docs and update them in the same PR. Likely candidates: `docs/device/os/update-system.md`, `tools/kn86fw/README.md`’s “Phase 0 scope” callout.
- What this pack is NOT a license for: writing any of the above modules now. Per the GWP-163 gate, design only.
Open questions for Josh
1. Inner-format wire layout. The `KN86SLOT` blob’s exact byte layout (field widths, alignment, ordering) is left for the implementation PR. Recommendation: 16-byte header + 56-byte section entries + 4 KiB-aligned section payloads, mirrored in `kn86slot.h` with `_Static_assert`. Confirm this is OK to lock at implementation time, or call it now.
2. Compression. Research brief §“Firmware image format” suggests gzipped slot images; the actually-shipped `.kn86fw` is raw. Recommendation: stay raw for v1 (~2.25 GB bundles transfer over USB in seconds; compression adds determinism risk). Revisit if a hosted-update channel needs the bandwidth savings. Confirm.
3. `gpt` crate vs hand-rolled partition reader. Recommendation: hand-rolled. The ADR-0011 layout is fixed and a ~120-line reader keeps the dependency surface small (no transitive deps, no MSRV churn). Confirm.
4. Auto-detect inner format on `kn86fw inspect`. Should `inspect` peek at the payload’s first 8 bytes and, if they’re `KN86SLOT`, print slot-aware output by default, or keep `inspect-slot` as a separate explicit subcommand? Recommendation: keep `inspect-slot` separate for v1 (less magic, easier to test); revisit when there’s a second inner format to disambiguate.
5. CI artifact retention. The slot-artifact triplet adds ~2.25 GB per release tag to the private monorepo’s release assets. GitHub’s per-release storage is generous but not infinite. Recommendation: keep the last 5 releases of slot artifacts, prune older. Confirm cadence + retention.
6. Does the producer also emit a “p1 common boot region” image? Recommendation: no for v1 (p1 isn’t part of any A/B update flow per ADR-0011), but flag for the implementation PR in case a future “rebuild p1 only” workflow surfaces. Confirm.
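For the auto-detect question, the peek itself is cheap regardless of which way the decision goes. The sketch below shows it under one loud assumption: that the payload begins immediately after the 128-byte outer header. The function name `payload_is_slot_bundle` and both constants are hypothetical.

```rust
/// Assumed: the payload starts right after the 128-byte outer .kn86fw header.
const OUTER_HEADER_LEN: usize = 128;
const SLOT_MAGIC: &[u8; 8] = b"KN86SLOT";

/// Return true when the bytes following the outer header start with the inner magic.
fn payload_is_slot_bundle(file_bytes: &[u8]) -> bool {
    file_bytes
        // `get` returns None for files too short to hold the magic, so no panic on tiny inputs.
        .get(OUTER_HEADER_LEN..OUTER_HEADER_LEN + SLOT_MAGIC.len())
        .map_or(false, |magic| magic == &SLOT_MAGIC[..])
}

fn main() {
    // Fake file: zeroed outer header followed by a payload that begins with the magic.
    let mut file = vec![0u8; OUTER_HEADER_LEN];
    file.extend_from_slice(SLOT_MAGIC);
    println!("slot bundle: {}", payload_is_slot_bundle(&file));
}
```

Keeping the check as a pure function over bytes means it could back either an explicit `inspect-slot` precondition or a later auto-detect path without changing the outer header parser.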