Boot and systemd

What happens between power-on and the first frame of nOSh content on the primary display: the Linux boot flow on the Pi Zero 2 W, the systemd unit graph for the nOSh runtime daemon and its peripheral siblings, ordering / dependency rules, restart policy, and log retention. Read this if you are adding a new systemd unit, debugging a slow boot, or wondering why nOSh refuses to start.

system-image-build.md — where these unit files get baked in (stage-kn86-runtime).
device-tree-overlays.md — overlays that must be live before the dependent units start.
kiosk-mode.md — getty/auto-login config that this graph assumes.
coprocessor-firmware.md — Pico 2 firmware whose UART handshake gates the nOSh unit start.
../../software/runtime/orchestration.md — what the nOSh process does once systemd hands it control.

Linux boot flow

The Pi Zero 2 W boots through a five-stage chain. We do not touch the first stage; the first three stages are vendor-supplied and pi-gen-provisioned (system-image-build.md).

power-on
  -> VideoCore GPU bootloader (vendor firmware on the SoC, untouchable)
  -> reads /autoboot.txt from p1, picks active bootfs slot per ADR-0011
  -> loads start.elf + kernel8.img + DTB + cmdline.txt + config.txt from p2 or p3
  -> Linux kernel boots
  -> systemd starts as PID 1, target = multi-user.target (NOT graphical.target)
  -> systemd brings up the unit graph below
  -> kn86-nosh.service runs nOSh as the kiosk user
  -> nOSh opens SDL3, renders boot animation on the Elecrow primary display

There is no display manager, no X server, no Wayland compositor. nOSh owns the framebuffer directly via SDL3’s KMSDRM backend (kiosk-mode.md).

Unit graph

systemd-tmpfiles-setup.service
        |
        v
systemd-udevd.service ----+----> kn86-cartridge-mount.path  (subscribes to USB-MSC udev)
                          |
                          +----> kn86-coprocessor.service   (waits for /dev/serial0)
                          |             |
                          +----> kn86-display-init.service  (DPMS on, vt1 console clear)
                                        |
                                        +-----------------+
                                                          v
                                                kn86-nosh.service
                                                          |
                                                          +-- (nOSh runtime — see orchestration.md)

Boot wall-clock target: kernel handoff to first nOSh frame in <8 s on a warm SD. ADR-0011’s USB-enumeration wait for the early-boot key gate adds ~2 s in the not-holding-the-combo case; that is included in the 8 s budget.
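One rough way to spot-check the budget on a dev-mode device is to compare the monotonic timestamp at which kn86-nosh.service became active against the 8 s target. The sketch below is only an approximation: it assumes systemd's ActiveEnterTimestampMonotonic property (microseconds, counted from roughly kernel start), which marks unit activation rather than the first rendered frame. The helper name within_boot_budget is hypothetical.

```shell
# Hypothetical helper: check a unit's activation time against the boot budget.
# Input is one line as printed by
#   systemctl show -p ActiveEnterTimestampMonotonic kn86-nosh.service
# for example "ActiveEnterTimestampMonotonic=6500000" (microseconds).
BUDGET_US=8000000   # the 8 s target from this section

within_boot_budget() (
    line=$1
    usec=${line#ActiveEnterTimestampMonotonic=}
    # Reject anything that is not a plain non-negative integer.
    case $usec in
        ''|*[!0-9]*) echo "unparseable: $line" >&2; exit 2 ;;
    esac
    # A value of 0 means the unit has never been active.
    [ "$usec" -gt 0 ] || exit 2
    # Exit status 0 when inside the budget, 1 when over it.
    [ "$usec" -lt "$BUDGET_US" ]
)
```

On a device this could be driven as `within_boot_budget "$(systemctl show -p ActiveEnterTimestampMonotonic kn86-nosh.service)"`; exit status 0 means the unit came up inside the budget.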

Unit file inventory

All units live at /etc/systemd/system/kn86-*.service (or .path), installed by stage-kn86-runtime of the system-image build.

Unit	Type	Job
`kn86-display-init.service`	oneshot	Clears the kernel console on tty1, sets DPMS on, configures the framebuffer mode for the Elecrow per `device-tree-overlays.md`.
`kn86-coprocessor.service`	simple	Owns `/dev/serial0`. Performs the HELLO + VERSION handshake with the Pico 2 per `coprocessor-firmware.md` and the coprocessor-protocol.md §5.3 bootstrap. Stays alive as the userspace daemon that nOSh talks to over a Unix socket at `/run/kn86/coproc.sock`.
`kn86-cartridge-mount.path`	path	Watches `/dev/disk/by-id/usb-KN86CART`; activates `kn86-cartridge-mount@.service` instances on insertion (`../../software/runtime/cartridge-lifecycle.md`).
`kn86-cartridge-mount@.service`	template (instantiated per device)	Mounts the cart’s filesystem read-only at `/mnt/cart` and emits an sd_notify message that nOSh picks up.
`kn86-updater-gate.service`	oneshot, before nOSh	ADR-0011 attention-gesture scan. Reads `/dev/input/event*` for ~2 s after USB HID enumerates; if SYS+LINK held, kexec into the updater image; otherwise exit 0.
`kn86-nosh.service`	simple	Launches the nOSh binary as user `nosh`. Owns the primary display, the OLED via the coprocessor daemon, and the input event loop.

Service ordering

Ordering is expressed with After= / Requires= / Wants= as follows:

# kn86-coprocessor.service
[Unit]
Description=KN-86 Pico 2 coprocessor daemon (UART, audio, OLED bridge)
After=systemd-udev-settle.service
Wants=systemd-udev-settle.service
ConditionPathExists=/dev/serial0
# Start rate limiting is a [Unit] setting on current systemd;
# StartLimitIntervalSec= is not recognised under [Service].
StartLimitIntervalSec=30s
StartLimitBurst=5

[Service]
Type=simple
ExecStart=/opt/nosh/bin/kn86-coproc-daemon /dev/serial0
Restart=on-failure
RestartSec=2s

# kn86-nosh.service
[Unit]
Description=KN-86 nOSh runtime
After=kn86-display-init.service kn86-coprocessor.service kn86-updater-gate.service
Requires=kn86-coprocessor.service
Wants=kn86-display-init.service
ConditionPathExists=/run/kn86/coproc.sock
StartLimitIntervalSec=60s
StartLimitBurst=3

[Service]
Type=simple
User=nosh
Group=nosh
ExecStart=/opt/nosh/bin/nosh
Restart=on-failure
RestartSec=3s

Requires=kn86-coprocessor.service means that if the Pico handshake fails repeatedly and kn86-coprocessor.service enters the failed state, nOSh does not start at all. The user sees the failure on Row 24 of the primary display: the earlier kn86-display-init.service writes a fallback message to the Linux text console, and because nOSh has not started, its Row 0/24 is not yet drawing over that message. This is the operator-visible hard fail that the coprocessor protocol §5.3 specifies.

ConditionPathExists=/dev/serial0 on the coprocessor daemon and /run/kn86/coproc.sock on nOSh prevents pointless restart loops when the device-tree overlay hasn’t applied (see device-tree-overlays.md for the UART0 overlay).
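To audit the ordering declared in a unit file, the dependency keys can be pulled out of its [Unit] section with a few lines of awk. This is a minimal sketch: unit_deps is a hypothetical helper name, and it reads a single file only, so it ignores drop-in overrides and continuation lines that `systemctl show -p After <unit>` would resolve on a live system.

```shell
# Hypothetical helper: print the values of one [Unit] key (After, Requires,
# Wants, ...) from a unit file, one dependency per line.
#   $1 = unit file path, $2 = key name without the trailing "="
unit_deps() {
    awk -v key="$2" '
        # Track which section we are in; only [Unit] keys are reported.
        /^\[/ { in_unit = ($0 == "[Unit]"); next }
        in_unit && index($0, key "=") == 1 {
            # Everything after "key=" is a whitespace-separated list.
            n = split(substr($0, length(key) + 2), deps, /[ \t]+/)
            for (i = 1; i <= n; i++) if (deps[i] != "") print deps[i]
        }
    ' "$1"
}
```

For example, `unit_deps /etc/systemd/system/kn86-nosh.service After` would list the three units nOSh is ordered behind, one per line.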

Restart policy

Unit	Restart=	RestartSec	Burst	Window	Rationale
`kn86-coprocessor.service`	`on-failure`	2 s	5	30 s	Transient UART glitches recover on restart; persistent failure is a hardware problem and hammering doesn’t help.
`kn86-nosh.service`	`on-failure`	3 s	3	60 s	nOSh segfault is rare; if it happens 3× in a minute, leave the service in `failed` and let the user power-cycle.
`kn86-display-init.service`	`no`	—	—	—	Oneshot; if it fails, the console message from systemd is enough.
`kn86-updater-gate.service`	`no`	—	—	—	Oneshot pre-nOSh gate; failure here is logged and boot proceeds (better to enter nOSh than refuse to boot).

When the burst rate-limit fires, nOSh ends up in failed state and the framebuffer holds whatever the display-init unit drew. Recovery is a power-cycle. Production mode disables the SSH path, so systemctl restart from the bench rig is dev-mode only (kiosk-mode.md).
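The burst and window columns can be reasoned about with a small model. The sketch below approximates a fixed-window start limiter: the window opens at the first start, each start inside it is counted (including the initial one), and a start beyond the burst allowance is refused. It is loosely modelled on how systemd applies StartLimitBurst= and StartLimitIntervalSec=; it is not systemd code, the helper name is hypothetical, and the real accounting may differ in detail.

```shell
# Sketch of a fixed-window start limiter (hypothetical helper).
#   $1 = burst, $2 = window in seconds, remaining args = start times (s)
# Prints "refused at <t>" for the first start attempt that would be blocked,
# or "ok" if every attempt is allowed.
start_limit_check() (
    burst=$1
    window=$2
    shift 2
    begin=-1
    count=0
    for t in "$@"; do
        if [ "$begin" -lt 0 ] || [ $((t - begin)) -gt "$window" ]; then
            # First start, or the previous window has expired: open a new one.
            begin=$t
            count=1
        elif [ "$count" -lt "$burst" ]; then
            # Still inside the allowance for this window.
            count=$((count + 1))
        else
            echo "refused at $t"
            exit 0
        fi
    done
    echo ok
)
```

With the nOSh limits, `start_limit_check 3 60 0 5 10 15` reports `refused at 15`: the initial start plus two restarts fit the allowance, and the start that would follow the third crash is blocked, which matches the "3× in a minute" rationale in the table.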

Wait conditions

The graph relies on three implicit wait gates:

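/dev/serial0 exists. Created by the UART0 device-tree overlay (device-tree-overlays.md). The coprocessor daemon’s ConditionPathExists= checks for it once, when systemd runs the unit’s start job; systemd does not poll. Ordering after systemd-udev-settle.service is what gives the device node time to appear, and if it is still missing at that point the unit is skipped rather than marked failed.
USB HID enumeration. The updater-gate scan reads /dev/input/event*. If the keyboard controller hasn’t enumerated through the internal USB hub yet (per ADR-0018), the scan window quietly returns “no key held” and boot continues. A false negative is acceptable; a false positive (entering the updater with no key pressed) is not.
/run/kn86/coproc.sock exists. The coprocessor daemon creates it after the Pico HELLO+VERSION handshake clears. nOSh’s ConditionPathExists= is likewise a one-shot check made when its start job runs, so the socket has to exist by the time After=kn86-coprocessor.service is satisfied. This gate is what produces the “nOSh blocks on Pico ready” behavior the coprocessor protocol §5.3 specifies.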
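Where a gate needs an explicit, bounded wait on a path (for example in a bench script, or in a hypothetical ExecStartPre= helper), it can be sketched in a few lines of shell. The function name wait_for_path and the one-second poll step are assumptions made for illustration; nothing in the shipped units is known to use this.

```shell
# Hypothetical helper: wait up to $2 seconds for path $1 to appear.
# Returns 0 as soon as the path exists, 1 on timeout.
wait_for_path() (
    path=$1
    remaining=$2
    while [ ! -e "$path" ]; do
        # Give up once the time allowance is spent.
        [ "$remaining" -gt 0 ] || exit 1
        sleep 1
        remaining=$((remaining - 1))
    done
)
```

A bench check for the coprocessor socket would then be `wait_for_path /run/kn86/coproc.sock 10 || echo "coprocessor daemon never became ready"`.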

Cmdline.txt baseline

The kernel command line for each bootfs slot is written by stage-kn86-firmware/00-kn86-firmware/01-run.sh at image-build time. It is the single source of truth for cmdline.txt flags — do not add flags elsewhere without updating that script and this section.

Prod slot flags

console=serial0,115200 console=tty1 root=PARTUUID=... rootfstype=ext4
fsck.repair=yes rootwait
quiet loglevel=3 vt.global_cursor_default=0 logo.nologo consoleblank=0

The KN-86-specific flags appended to the pi-gen default line:

Flag	Purpose
`quiet`	Suppresses kernel messages printed to the framebuffer during boot. Without this, kernel printk output overwrites the boot splash on the primary display. Drop this flag in dev mode (see below).
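`loglevel=3`	Sets the kernel console log level to 3, so only messages more severe than that level (critical, alert, emergency) are printed to the console. The in-kernel ring buffer still records everything, so `dmesg` is unaffected. Complements `quiet`; belt-and-suspenders so that even if `quiet` is parsed after a tty switch, only the most serious messages surface.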
`vt.global_cursor_default=0`	Hides the blinking text cursor on all virtual terminals. The cursor is visible on tty1 between boot and the moment nOSh claims the framebuffer via KMSDRM; removing it avoids a cosmetic flicker on the primary display.
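`logo.nologo`	Suppresses the boot logo that the kernel’s framebuffer console draws at the top of the screen (on a Pi, the row of Raspberry Pi logos). The firmware’s rainbow splash is a separate `config.txt` setting (`disable_splash=1`) and is not affected by this flag. Keeps the primary display black until the kn86-boot-splash service fires.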
`consoleblank=0`	Disables the Linux kernel’s built-in console blanker. nOSh owns screen-blank timing via its own idle clock and DPMS-off through SDL3/KMSDRM (see `power-idle.md`). Double-blanking — kernel blanker racing nOSh’s blanker — would produce unpredictable wake behaviour.

Dev mode

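When the image is built with KN86_BUILD_MODE=dev (set in the host build environment), quiet is omitted from the flag set. Boot messages then remain visible on the framebuffer, which is useful during bring-up debugging. Because systemd also reads quiet from the kernel command line, omitting it brings back systemd’s boot status lines as well as kernel output. All other flags (loglevel=3 vt.global_cursor_default=0 logo.nologo consoleblank=0) remain active in dev mode, so kernel console output is still filtered by loglevel=3.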

Overlayroot

When KN86_OVERLAYROOT=1 is set at build time, init=/init-overlay is prepended to the KN-86 flag set (before quiet). This is the standard mechanism for enabling overlayroot via cmdline.txt — the overlay init script intercepts PID 1 before systemd and sets up the read-only root bind-mounts. See kiosk-mode.md for the overlayroot strategy.
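The flag rules from the prod, dev, and overlayroot subsections can be summarised as a single assembly function. This is a sketch of the described behaviour, not the real 01-run.sh: the function name is hypothetical, and it assumes that any KN86_BUILD_MODE value other than dev is treated as a production build.

```shell
# Sketch of the flag assembly described above (hypothetical function name).
# Reads KN86_BUILD_MODE and KN86_OVERLAYROOT from the environment and prints
# the KN-86 flag block that gets appended to the pi-gen default line.
kn86_cmdline_flags() (
    flags="loglevel=3 vt.global_cursor_default=0 logo.nologo consoleblank=0"
    # Dev builds drop "quiet"; every other build keeps it (assumption).
    [ "${KN86_BUILD_MODE:-}" = "dev" ] || flags="quiet $flags"
    # Overlayroot builds prepend the overlay init, ahead of "quiet".
    [ "${KN86_OVERLAYROOT:-}" = "1" ] && flags="init=/init-overlay $flags"
    echo "$flags"
)
```

Running it with KN86_BUILD_MODE=dev and KN86_OVERLAYROOT=1 set prints `init=/init-overlay loglevel=3 vt.global_cursor_default=0 logo.nologo consoleblank=0`.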

Idempotency

The stage script uses consoleblank=0 as a sentinel: if it is already present in cmdline.txt (e.g. a stage re-run), the entire flag block is skipped. This prevents duplicate flags on repeated builds.
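The sentinel check amounts to a guarded append. The sketch below illustrates the idea under stated assumptions: append_kn86_flags is a hypothetical name, the flag block is passed in as an argument, and the real stage script may structure this differently. It also keeps cmdline.txt on a single line, which the Raspberry Pi firmware requires.

```shell
# Sketch of the sentinel check (hypothetical helper, not the real stage script).
#   $1 = path to cmdline.txt, $2 = flag block to append
append_kn86_flags() (
    file=$1
    flags=$2
    # consoleblank=0 already present: assume the whole block was applied.
    if grep -q 'consoleblank=0' "$file"; then
        exit 0
    fi
    # cmdline.txt must stay a single line, so rewrite that line with the
    # flags appended instead of adding a second line.
    line=$(head -n 1 "$file")
    printf '%s %s\n' "$line" "$flags" > "$file"
)
```

Calling it twice with the same arguments leaves the file unchanged after the first call, which is the re-run behaviour this section describes.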

Journald + log retention

Per kiosk-mode.md, journald runs in volatile mode by default — logs live in /run/log/journal/ (tmpfs, capped 32 MB total / 8 MB free reserved) and clear on reboot. This is intentional: the read-only-root kiosk filesystem has no good place to keep persistent logs, and a kiosk device leaking SD writes to a log is a worn-flash risk.

In developer mode (the /boot/kn86-mode.txt flag from ../hardware/build-specification.md §5), journald flips to persistent mode and writes to /var/log/journal/ on the rootfs. Log size capped at 100 MB total / 30 MB per service. Useful for journalctl -u kn86-nosh -f from an SSH session during a debug run.

A bench rig that needs to capture logs in production mode for a one-off bug should toggle to dev mode for the repro, capture, then toggle back — there is no in-place “make this prod-mode boot persistent” knob, and adding one would defeat the kiosk-write-discipline.
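The two retention profiles map onto standard journald.conf keys. The sketch below writes a drop-in for either mode; the function name and the drop-in file name are hypothetical, and how the image actually switches profiles is not specified here. The 30 MB per-service cap is left out because this sketch does not assume how it is enforced.

```shell
# Sketch: emit a journald drop-in matching the caps quoted above.
# Hypothetical helper and file name; the keys are standard journald.conf keys.
#   $1 = mode ("dev" for persistent logs, anything else for volatile)
#   $2 = target directory for the drop-in
write_journald_dropin() (
    mode=$1
    dir=$2
    mkdir -p "$dir"
    if [ "$mode" = "dev" ]; then
        # Persistent journal under /var/log/journal, 100 MB total.
        printf '[Journal]\nStorage=persistent\nSystemMaxUse=100M\n'
    else
        # Volatile journal under /run/log/journal, 32 MB cap, 8 MB kept free.
        printf '[Journal]\nStorage=volatile\nRuntimeMaxUse=32M\nRuntimeKeepFree=8M\n'
    fi > "$dir/10-kn86-journald.conf"
)
```

On a device the target directory would typically be /etc/systemd/journald.conf.d, followed by a restart of systemd-journald for the change to take effect.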

nOSh app-level log rotation (GWP-354)

nOSh does not write its own log files. The runtime emits all diagnostics to stdout / stderr; systemd captures both streams into the journal under kn86-nosh.service. There is no /var/log/kn86-*, no logrotate config, no log4j-shaped rotation worry. The journald caps above ARE nOSh’s effective log retention policy.

This is deliberate: a separate log file would either (a) write to the read-only rootfs in production (bad), (b) write to /home/shared (burns SD writes on operator-state-only territory), or (c) need its own rotation/cleanup machinery. journald already solves all three problems for free, with journalctl -u kn86-nosh -f as the standard debug surface.

If a future cart or sub-system genuinely needs a structured event log (e.g., a multi-day session log a player can review on their next boot), it lives in /home/shared/<cart_id>/ per ../../adr/ADR-0019-cartridge-storage-and-form-factor.md Universal Deck State semantics, not as an OS-level log.

Time and clock

Pi Zero 2 W has no battery-backed RTC. The boot-time clock provenance chain (GWP-349) is:

fake-hwclock restores /etc/fake-hwclock.data (last-known wall-clock at the previous shutdown) very early, before the unit graph proper. The package is preinstalled in stage-kn86-base.
systemd-timesyncd (dev mode only — masked in production) starts after Wi-Fi comes up and jumps the clock to NTP truth. NTP sources: pool.ntp.org primary, time.cloudflare.com + time.google.com fallback. Aggressive poll interval (PollIntervalMinSec=16) for fast convergence on a stale boot.
fake-hwclock writes /etc/fake-hwclock.data on shutdown so the next cold boot has a starting point.

fake-hwclock.data lives on the rootfs (/etc/fake-hwclock.data). On a read-only-root production image the file is read-only at runtime and the periodic save fails — that is fine: the file was already written by the build, so first boot starts at the build timestamp, and subsequent runtime drift is accepted.

Production mode has no NTP — systemd-timesyncd.service is masked and Wi-Fi is off (kiosk-mode.md “Network stack”), so the clock will drift between cold boots in production. This is accepted: nOSh timestamps are advisory, not load-bearing for any gameplay or save semantics. kn86-nosh.service does NOT depend on time-sync.target.

Hardware watchdog (GWP-352)

The closed-lid Pelican kiosk has no externally accessible reset button; if userspace wedges, the BCM2837 hardware watchdog is the only hang-recovery path. Two pieces:

Firmware enable. dtparam=watchdog=on in /boot/firmware/config.txt exposes /dev/watchdog to userspace. See device-tree-overlays.md “Hardware watchdog enable”.
systemd consumption. /etc/systemd/system.conf.d/10-kn86-watchdog.conf:
```
[Manager]
RuntimeWatchdogSec=15s
RebootWatchdogSec=120s
```
RuntimeWatchdogSec=15s means systemd pings /dev/watchdog every ~7.5 s; if PID-1 wedges for 15 s the BCM hardware fires a reset. RebootWatchdogSec=120s bounds the worst case for “graceful shutdown chain itself hangs” — if reboot takes longer than 120 s, the watchdog reboots us anyway.
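Since the watchdog only works when both pieces are in place, a pre-flight check on an image or a running device can be useful. The following sketch is hypothetical in its name and messages, and it assumes the dtparam appears on its own line exactly as written above (a combined `dtparam=` line would not match).

```shell
# Hypothetical pre-flight check for the two watchdog pieces.
#   $1 = config.txt path, $2 = systemd manager drop-in path
watchdog_configured() (
    config_txt=$1
    dropin=$2
    # Piece 1: firmware exposes /dev/watchdog.
    grep -q '^dtparam=watchdog=on' "$config_txt" ||
        { echo "firmware: watchdog dtparam missing"; exit 1; }
    # Piece 2: systemd is told to feed it.
    grep -q '^RuntimeWatchdogSec=' "$dropin" ||
        { echo "systemd: RuntimeWatchdogSec missing"; exit 1; }
    echo "watchdog configured"
)
```

On a device the call would be `watchdog_configured /boot/firmware/config.txt /etc/systemd/system.conf.d/10-kn86-watchdog.conf`.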

Per-service watchdog wiring (`kn86-nosh.service`)

kn86-nosh.service ships with per-service watchdog integration (landed in jschairb/kn86-deckline#225):

[Service]
Type=notify
NotifyAccess=main
WatchdogSec=10

WatchdogSec=10 means systemd expects a WATCHDOG=1 ping from the nOSh process at least every 10 s (the recommended interval is ≤ WatchdogSec/2, i.e., ≤ 5 s). If the ping window expires, systemd treats it as a service failure and, under Restart=on-failure, restarts nOSh at the service boundary without rebooting the board. The system-level RuntimeWatchdogSec=15s is a separate mechanism that resets the board only if PID 1 itself stops responding.

Type=notify + NotifyAccess=main means nOSh must also send READY=1 via sd_notify() when it is ready to serve. This replaces the earlier Type=simple entry in the unit file and ensures systemd knows exactly when nOSh is up, not just started.

Runtime implementation (kn86-emulator/src/watchdog.c): The nOSh runtime calls kn86_watchdog_init() before the event loop (sends READY=1) and kn86_watchdog_tick(current_tick) at the frame boundary inside frame_step() just before render_frame(). The tick function is rate-limited to at most one WATCHDOG=1 ping per 5 s so it does not flood the socket. Any hang in event dispatch or runtime tick blocks the ping and expires the 10 s window.

Both functions are #ifdef __linux__-guarded; in the macOS desktop emulator build they compile to no-ops, so macOS builds need no libsystemd dependency.
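The rate-limit logic of the tick function can be illustrated in shell. This shows the logic only and is not the C code in watchdog.c: NOTIFY_CMD, PING_INTERVAL_S, and last_ping_s are hypothetical names, and time is passed in as whole seconds. Note also that with NotifyAccess=main a ping sent by a helper process such as systemd-notify may not be accepted, so this is not a drop-in replacement for the in-process sd_notify() call.

```shell
# Shell illustration of the rate-limited watchdog ping (logic only).
NOTIFY_CMD=${NOTIFY_CMD:-systemd-notify}   # overridable for dry runs
PING_INTERVAL_S=5                          # half of WatchdogSec=10
last_ping_s=-1                             # -1 = no ping sent yet

# $1 = current time in whole seconds (a monotonic clock in a real runtime).
# Sends WATCHDOG=1 at most once per PING_INTERVAL_S seconds.
watchdog_tick() {
    if [ "$last_ping_s" -lt 0 ] ||
       [ $(( $1 - last_ping_s )) -ge "$PING_INTERVAL_S" ]; then
        # Only record the ping time if the notification was sent.
        "$NOTIFY_CMD" WATCHDOG=1 && last_ping_s=$1
    fi
}
```

Called once per frame, this pings on the first frame and then roughly every 5 s. If the frame loop stalls, no call happens, no ping goes out, and the 10 s window expires.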

Manual on-device test procedure: To verify the watchdog fires on a live device (requires dev-mode SSH access per kiosk-mode.md):

# 1. Find the nOSh PID.
systemctl status kn86-nosh.service   # note the Main PID

# 2. Suspend the process to simulate a hang.
kill -SIGSTOP <nosh-pid>

# 3. Wait 12–15 s (past WatchdogSec=10).

# 4. Observe systemd restart the service.
journalctl -u kn86-nosh.service -f
# Expect: "kn86-nosh.service: Watchdog timeout (limit 10s)!"
#         "kn86-nosh.service: Killing processes…"
#         "Started KN-86 nOSh runtime."  (restart)

The service will restart once on its own. If it hits the burst limit (3 restarts in 60 s), it enters failed state and requires a manual systemctl start kn86-nosh.service or a power-cycle to recover. See Restart policy above for the configured limits.

Relation to the system-level watchdog: RuntimeWatchdogSec=15s (configured in /etc/systemd/system.conf.d/10-kn86-watchdog.conf per the section above) fires a full board reset if PID-1 itself wedges. The per-service WatchdogSec=10 catches nOSh-specific hangs at the service boundary first, giving the system a restart-without-reboot recovery path for the common case. A runtime-level deadlock that keeps PID-1 responsive but leaves nOSh unresponsive is caught by the per-service window; a PID-1 hang is caught by the board-level RuntimeWatchdogSec.