Audio Pipeline

How a cartridge audio primitive call travels from cart-side Lisp to audible sound — from the psg-tone FFI invocation through the coprocessor bridge to the YM2149 emulator and the MAX98357A speaker. Single source for the audio rendering path on both emulator and prototype targets.

Related:

  • adr/ADR-0017 — commits the Pico 2 coprocessor as YM2149 PSG synthesizer and I2S output driver. Read first.
  • adr/ADR-0005 — NoshAPI FFI surface §Audio: the cart-facing primitives (psg-tone, psg-noise, psg-envelope, psg-silence).
  • kn86-emulator/src/psg.c, psg.h — YM2149 emulator (14 registers, 3 tone + noise + envelope).
  • kn86-emulator/src/sound.c, sound.h — SDL3 audio callback, PSG sample loop, nosh_psg_* wrappers.
  • kn86-emulator/src/coproc.c, coproc.h — coprocessor vtable; routes PSG commands in-process (emulator) or over UART (device).
  • coprocessor-bridge.md — the Pi-side UART daemon that owns /dev/serial0 on the prototype.
  • docs/software/api-reference/grammars/coprocessor-protocol.md — wire format for PSG_REG_WRITE, PSG_RESET, PSG_BULK_WRITE.

Cartridges never write PSG registers directly. They call NoshAPI FFI primitives from their Lisp source, which are documented in ADR-0005 §Audio. The canonical set:

| Primitive | Description |
|---|---|
| (psg-tone channel freq volume) | Set a tone on channel 0–2 at a given frequency (Hz) and volume (0–15). |
| (psg-noise period) | Enable noise generator at the given period (0–31). |
| (psg-envelope shape period) | Set envelope shape (0–15) and 16-bit period. |
| (psg-silence) | Disable all channels, zero all amplitudes. |
| (psg-reg-write reg val) | Low-level register write (0–13). For advanced use only. |

Each of these resolves to one or more nosh_psg_write() calls in sound.c. For example, psg-tone writes the two period registers, the amplitude register, and the mixer register: typically four calls. The primitives are documented in ADR-0005; this doc covers only the pipeline below them.
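As an illustration of that expansion, here is a minimal sketch of how a psg-tone call could turn into four register writes. It is not the sound.c implementation: sketch_psg_write() and shadow_regs are hypothetical stand-ins for nosh_psg_write() and its state, the initial mixer value is an assumption, and the frequency-to-period conversion uses the standard AY-3-8910/YM2149 relation f = clock / (16 × period) with the 2 MHz clock described later in this doc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for nosh_psg_write(): records writes in a shadow array.
 * Assumption: the mixer (R7) starts with every tone and noise bit disabled. */
static uint8_t shadow_regs[14] = { [7] = 0x3F };
static int write_count = 0;

static void sketch_psg_write(uint8_t reg, uint8_t val) {
    assert(reg < 14);
    shadow_regs[reg] = val;
    write_count++;
}

/* Sketch of (psg-tone channel freq volume) as four register writes. */
static void sketch_psg_tone(unsigned channel, unsigned freq_hz, unsigned volume) {
    assert(channel < 3 && freq_hz > 0 && volume <= 15);

    /* Assumed relation: f = 2 MHz / (16 * period), so period = 125000 / f (rounded). */
    unsigned period = (125000u + freq_hz / 2u) / freq_hz;
    if (period < 1u) period = 1u;
    if (period > 0x0FFFu) period = 0x0FFFu;   /* clamp to the 12-bit field */

    sketch_psg_write((uint8_t)(2u * channel), (uint8_t)(period & 0xFFu));  /* low byte */
    sketch_psg_write((uint8_t)(2u * channel + 1u), (uint8_t)(period >> 8)); /* top 4 bits */
    sketch_psg_write((uint8_t)(8u + channel), (uint8_t)volume);             /* fixed level */
    sketch_psg_write(7u, (uint8_t)(shadow_regs[7] & ~(1u << channel)));     /* 0 = tone on */
}
```

The mixer write is a read-modify-write against the shadow copy so that enabling one channel does not disturb the others.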


The path from a primitive call to audible sound passes through four layers regardless of target:

Cart Lisp primitive
        │
nosh_psg_write() / nosh_psg_reset()   [sound.c]
        │ dispatches via CoprocessorAPI vtable
        ├───────────────────────────────┐
EMULATOR PATH                     PROTOTYPE PATH
        │                               │
inprocess_dispatch()              coproc_send() → UART frame
[coproc.c]                        → /dev/serial0 → Pico 2
        │                               │
psg_write() [psg.c]               Pico: YM2149 emulation
        │                               │
SDL audio callback                Pico: I2S out → MAX98357A
[sound.c]                               │
        │                         28mm 8Ω speaker
44.1 kHz PCM stream
→ SDL audio device
→ system speaker

The vtable seam is the key abstraction. sound.c holds a CoprocessorAPI *coproc pointer captured at sound_init(). Every nosh_psg_write() call dispatches through coproc->psg_write(reg, val) when the vtable is bound. On the emulator the vtable contains emu_psg_write() trampolines that call psg_write() directly. On the prototype the vtable contains UART-marshaling trampolines. Call sites never observe the swap.
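The seam described above can be sketched as a struct of function pointers that is bound once and then called blindly. This is only an illustration: the doc names CoprocessorAPI, coproc->psg_write(reg, val) and emu_psg_write(), while the struct layout, uart_psg_write(), the byte log, and the sketch_ prefixed functions are hypothetical and the real UART framing is omitted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vtable shape; only psg_write(reg, val) is named in this doc. */
typedef struct CoprocessorAPI {
    void (*psg_write)(uint8_t reg, uint8_t val);
} CoprocessorAPI;

static uint8_t emu_regs[14];   /* stand-in for the in-process PSG state */
static uint8_t uart_log[64];   /* stand-in for bytes headed to the UART */
static size_t  uart_len;

/* Emulator backend: touch the PSG state directly. */
static void emu_psg_write(uint8_t reg, uint8_t val) {
    if (reg < 14) emu_regs[reg] = val;
}

/* Device backend: marshal the write instead (real framing and CRC omitted). */
static void uart_psg_write(uint8_t reg, uint8_t val) {
    if (uart_len + 2 <= sizeof uart_log) {
        uart_log[uart_len++] = reg;
        uart_log[uart_len++] = val;
    }
}

static const CoprocessorAPI emu_api  = { emu_psg_write };
static const CoprocessorAPI uart_api = { uart_psg_write };

static const CoprocessorAPI *coproc;   /* captured once at init */

static void sketch_sound_init(const CoprocessorAPI *api) { coproc = api; }

/* Call sites use this and never learn which backend is bound. */
static void sketch_nosh_psg_write(uint8_t reg, uint8_t val) {
    if (coproc && coproc->psg_write) coproc->psg_write(reg, val);
}
```

Swapping emu_api for uart_api in the init call changes where the write lands without touching any caller.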

On the emulator, coproc.c runs in COPROC_MODE_INPROCESS by default. coproc_send() builds a v0.2 wire frame (for protocol-level validation), then calls inprocess_dispatch(), which routes on the frame type:

  • COPROC_FT_PSG_REG_WRITE (0x20): calls psg_write(g_emu_psg, reg, val).
  • COPROC_FT_PSG_RESET (0x21): calls psg_reset(g_emu_psg).
  • COPROC_FT_PSG_BULK_WRITE (0x22): calls psg_write() 14 times (one per register).
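The routing in the list above can be sketched as a switch on the frame type. The three type codes come from this doc; the payload layouts (two bytes for a register write, fourteen values for a bulk write), the return codes, and the sketch_ functions are assumptions made for illustration. The authoritative wire format is the coprocessor protocol grammar.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    COPROC_FT_PSG_REG_WRITE  = 0x20,
    COPROC_FT_PSG_RESET      = 0x21,
    COPROC_FT_PSG_BULK_WRITE = 0x22,
    PSG_NUM_REGS             = 14
};

static uint8_t psg_regs[PSG_NUM_REGS];   /* stand-in for the emulator's PSG state */

static void sketch_psg_write(uint8_t reg, uint8_t val) {
    if (reg < PSG_NUM_REGS) psg_regs[reg] = val;
}

static void sketch_psg_reset(void) {
    for (size_t i = 0; i < PSG_NUM_REGS; i++) psg_regs[i] = 0;
}

/* Returns 0 on success, -1 for an unknown type or a short payload.
 * Assumed payloads: REG_WRITE = {reg, val}; BULK_WRITE = values for R0..R13. */
static int sketch_inprocess_dispatch(uint8_t type, const uint8_t *payload, size_t len) {
    switch (type) {
    case COPROC_FT_PSG_REG_WRITE:
        if (len < 2) return -1;
        sketch_psg_write(payload[0], payload[1]);
        return 0;
    case COPROC_FT_PSG_RESET:
        sketch_psg_reset();
        return 0;
    case COPROC_FT_PSG_BULK_WRITE:
        if (len < PSG_NUM_REGS) return -1;
        for (uint8_t r = 0; r < PSG_NUM_REGS; r++) sketch_psg_write(r, payload[r]);
        return 0;
    default:
        return -1;
    }
}
```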

The SDL audio callback in sound.c runs on a separate OS thread. SDL invokes it once per 1024-sample buffer (about every 23 ms at 44.1 kHz), and each invocation calls psg_sample(&state->psg) once for each of the 1024 samples. The psg_sample() function advances the tone counters, noise LFSR, and envelope counter, then sums the three channel outputs through a logarithmic amplitude table and converts the result to a signed 16-bit mono PCM sample.
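To make the per-buffer versus per-sample distinction concrete, here is a minimal sketch of a callback body that fills one buffer. SketchPSG is a deliberately tiny stand-in for PSGState (a single square-wave voice), and the callback signature is simplified rather than SDL's; none of these names exist in sound.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, tiny stand-in for PSGState: one square-wave voice. */
typedef struct {
    uint32_t half_period;   /* output samples between toggles */
    uint32_t counter;
    int16_t  level;         /* current output, +amp or -amp */
} SketchPSG;

/* One output sample: advance the counter, toggle when it expires. */
static int16_t sketch_psg_sample(SketchPSG *psg) {
    if (++psg->counter >= psg->half_period) {
        psg->counter = 0;
        psg->level = (int16_t)-psg->level;
    }
    return psg->level;
}

/* One callback invocation: produce `frames` mono samples in a single pass. */
static void sketch_audio_callback(void *userdata, int16_t *out, size_t frames) {
    SketchPSG *psg = (SketchPSG *)userdata;
    for (size_t i = 0; i < frames; i++) out[i] = sketch_psg_sample(psg);
}
```

With half_period = 50 the voice toggles every 50 samples, which is a 441 Hz square wave at 44.1 kHz.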

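The audio callback reads directly from the same PSGState struct that psg_write() modifies. There is no lock or double-buffer between them. Individual register writes are treated as atomic (the YM2149's registers are 8 bits wide), but a value that spans two registers, such as a 12-bit tone period or the 16-bit envelope period, can be sampled between its two writes. The resulting audio artifacts from torn writes are acceptable in v0.1.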

On the device build, coproc_send() runs in COPROC_MODE_UART. After building the v0.2 frame, it writes the frame bytes to /dev/serial0 (UART0, GPIO14/15, 1 Mbps 8N1 per ADR-0017 §4). The Pico 2 receives the frame, validates CRC-16/CCITT-FALSE, and dispatches PSG_REG_WRITE to its own YM2149 emulation state.
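For reference, CRC-16/CCITT-FALSE is the variant with polynomial 0x1021, initial value 0xFFFF, no bit reflection and no final XOR. The bitwise routine below is a generic sketch of that algorithm, not the code in coproc.c or the Pico firmware; which frame bytes the CRC covers is defined by the protocol spec and is not assumed here.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-16/CCITT-FALSE: poly 0x1021, init 0xFFFF, no reflection, no final XOR. */
static uint16_t crc16_ccitt_false(const uint8_t *data, size_t len) {
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);   /* feed next byte, MSB first */
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x1021u)
                                  : (uint16_t)(crc << 1);
        }
    }
    return crc;
}
```

The standard check input, the nine ASCII bytes "123456789", yields 0x29B1, which is a convenient self-test when bringing up either end of the link.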

The Pico 2 runs the YM2149 synthesis loop and outputs 44.1 kHz mono PCM over I2S to the MAX98357A Class-D DAC/amplifier, which drives the 28mm 8Ω 2W speaker.

Critical distinction: audio leaves the Pi over UART as register-write commands, not as PCM samples. The Pi has no direct audio output path to the speaker in the v1.0 architecture. The Pico owns the entire synthesis-to-speaker chain. Pi-side ALSA or PulseAudio is not involved in normal operation.

Open question (GWP-171): if a future PCM voice-bark feature (non-YM2149 audio) is added, the architecture team needs to decide whether the Pi gains a direct audio path or whether PCM is relayed over UART to the Pico. ADR-0017 does not resolve this — flag to platform engineering before implementing GWP-171.


The YM2149 has 14 registers (0–13). PSGState in types.h and the register layout in psg.c are the implementation-level sources of truth. High-level summary:

| Registers | Function |
|---|---|
| 0–1 | Channel A tone period (12-bit, little-endian) |
| 2–3 | Channel B tone period |
| 4–5 | Channel C tone period |
| 6 | Noise period (5-bit) |
| 7 | Mixer: bits 0–2 = tone enable (0=on), bits 3–5 = noise enable (0=on) |
| 8–10 | Channel A–C amplitude: bits 0–3 = level, bit 4 = use envelope |
| 11–12 | Envelope period (16-bit, little-endian) |
| 13 | Envelope shape (4-bit: CONTINUE, ATTACK, ALTERNATE, HOLD) |
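The layout in the table can be read back with a few small accessors. This is a sketch over a plain 14-byte register array; the helper names are hypothetical and PSGState in the real code may store these fields differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* 12-bit tone period for channel 0..2: low byte in R(2ch), top 4 bits in R(2ch+1). */
static uint16_t tone_period(const uint8_t regs[14], unsigned ch) {
    return (uint16_t)(regs[2 * ch] | ((regs[2 * ch + 1] & 0x0Fu) << 8));
}

/* Mixer bits are active-low: 0 means the source is enabled. */
static bool tone_enabled(const uint8_t regs[14], unsigned ch) {
    return ((regs[7] >> ch) & 1u) == 0;
}

static bool noise_enabled(const uint8_t regs[14], unsigned ch) {
    return ((regs[7] >> (3 + ch)) & 1u) == 0;
}

/* Amplitude registers R8..R10: bit 4 selects the envelope, bits 0-3 the fixed level. */
static bool uses_envelope(const uint8_t regs[14], unsigned ch) {
    return (regs[8 + ch] & 0x10u) != 0;
}

static uint8_t fixed_level(const uint8_t regs[14], unsigned ch) {
    return (uint8_t)(regs[8 + ch] & 0x0Fu);
}

/* 16-bit envelope period: low byte in R11, high byte in R12. */
static uint16_t envelope_period(const uint8_t regs[14]) {
    return (uint16_t)(regs[11] | (regs[12] << 8));
}
```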

Clock model: psg.c uses a 2 MHz virtual master clock divided by 8, giving a 250 kHz tone clock. The fractional prescaler (prescale_accum, fixed-point 16.16) advances ~5.67 steps per 44.1 kHz sample. The 16-entry logarithmic amplitude table in psg_sample() approximates the YM2149’s DAC curve (roughly √2 per step).
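The fractional prescaler can be sketched as a 16.16 fixed-point accumulator: add a constant step per output sample, take the integer part as the number of tone-clock ticks to run, and keep the fraction. The constant and function names below are hypothetical; only prescale_accum, the 250 kHz tone clock and the 44.1 kHz sample rate come from the text above.

```c
#include <stdint.h>

#define TONE_CLOCK_HZ  250000u   /* 2 MHz master clock / 8 */
#define SAMPLE_RATE_HZ 44100u

/* 16.16 fixed-point tone-clock ticks per output sample (about 5.67). */
static const uint32_t PRESCALE_STEP =
    (uint32_t)((((uint64_t)TONE_CLOCK_HZ) << 16) / SAMPLE_RATE_HZ);

/* Returns how many whole tone-clock ticks to advance for the next sample,
 * carrying the fractional remainder forward in *prescale_accum. */
static uint32_t ticks_for_next_sample(uint32_t *prescale_accum) {
    *prescale_accum += PRESCALE_STEP;
    uint32_t ticks = *prescale_accum >> 16;   /* integer part */
    *prescale_accum &= 0xFFFFu;               /* keep only the fraction */
    return ticks;
}
```

Each call returns 5 or 6, and because the step is truncated to 16 fractional bits the total over one second of samples falls one tick short of 250,000, an error far below anything audible.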

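Envelope shapes: 16 shapes encoded by the 4-bit register. The envelope_level() helper in psg.c implements the correct YM2149 shape mapping with CONTINUE/ATTACK/ALTERNATE/HOLD semantics. Shapes 0x00–0x07 have CONTINUE clear: they run a single cycle and then fall to zero. Shapes 0x08–0x0F have CONTINUE set: after the first cycle they repeat, alternate, or hold according to the ALTERNATE and HOLD bits.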
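The shape semantics can be sketched as a pure function of the shape bits and an envelope step counter. This follows the standard AY-3-8910/YM2149 behaviour but is not the envelope_level() in psg.c: the function name and signature are hypothetical, and it assumes 16 steps per cycle to match the 16-entry amplitude table (the hardware YM2149 subdivides each cycle more finely).

```c
#include <stdbool.h>
#include <stdint.h>

/* Level 0..15 at envelope step `step` (assumed 16 steps per cycle). */
static uint8_t sketch_envelope_level(uint8_t shape, uint32_t step) {
    bool cont   = (shape & 0x08u) != 0;
    bool attack = (shape & 0x04u) != 0;
    bool alt    = (shape & 0x02u) != 0;
    bool hold   = (shape & 0x01u) != 0;

    uint32_t cycle = step / 16u;
    uint8_t  pos   = (uint8_t)(step % 16u);

    /* First cycle: ramp up if ATTACK is set, otherwise ramp down. */
    if (cycle == 0) return attack ? pos : (uint8_t)(15u - pos);

    /* CONTINUE clear: output drops to zero after the first cycle. */
    if (!cont) return 0;

    /* HOLD: freeze at the first cycle's end level, flipped if ALTERNATE is set. */
    if (hold) {
        bool ends_high = attack;
        if (alt) ends_high = !ends_high;
        return ends_high ? 15 : 0;
    }

    /* Otherwise repeat; ALTERNATE reverses direction on odd cycles. */
    bool rising = attack;
    if (alt && (cycle & 1u)) rising = !rising;
    return rising ? pos : (uint8_t)(15u - pos);
}
```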

Noise: 17-bit LFSR with feedback from bits 16 and 13, updated at the noise-period rate.
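A 17-bit LFSR with feedback from bits 16 and 13 can be sketched as below. The shift direction, the function name, and which bit is used as the noise output are assumptions; psg.c may arrange the register differently while producing an equivalent sequence.

```c
#include <stdint.h>

/* One step of a 17-bit Fibonacci LFSR: XOR bits 16 and 13, shift left,
 * and feed the result into bit 0. The state must be seeded non-zero. */
static uint32_t noise_lfsr_step(uint32_t state) {
    uint32_t feedback = ((state >> 16) ^ (state >> 13)) & 1u;
    return ((state << 1) | feedback) & 0x1FFFFu;   /* keep 17 bits */
}
```

These taps give a maximal-length sequence, so any non-zero seed cycles through all 131,071 non-zero states before repeating. An all-zero state would stay stuck at zero, which is why the seed matters.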


Latency budget from a PSG_REG_WRITE call in nOSh to an audible tone at the speaker (prototype path). Values are from the coprocessor protocol spec §7:

| Stage | Time |
|---|---|
| Pi userspace serialises frame (memcpy + CRC) | ≤ 50 µs |
| Pi UART TX → wire (8 bytes at 1 Mbps 8N1) | ≤ 80 µs |
| Pico UART RX → parse → CRC check → dispatch | ≤ 200 µs |
| Pico writes value to PSG register state | ≤ 1 µs |
| Audio sample period (1 / 44100 Hz) | 22.7 µs |
| I2S output buffer drain (256-sample double-buffer) | 5–12 ms |
| MAX98357A → speaker | < 1 ms |

Typical sum: ~9 ms. Worst-case: ~12.5 ms. ADR-0017 §Known Unknowns #5 set a <30 ms target; the budget leaves ~17 ms headroom. If bring-up measures actual latency above 20 ms, investigate the Pico’s I2S buffer depth — see the coprocessor protocol spec §7 for the full analysis.

On the emulator the UART hop is eliminated; latency is dominated by the SDL audio buffer (1024 samples at 44.1 kHz, about 23 ms). This is higher than the device path for PSG register effects but acceptable for development.


On the emulator, sound_init() in sound.c calls SDL_OpenAudioDevice() with a 44.1 kHz mono 16-bit spec and a 1024-sample buffer. The audio callback audio_callback() runs on the SDL audio thread and fills the buffer by calling psg_sample() repeatedly. SDL_PauseAudioDevice(device, 0) starts playback immediately after init.

On the prototype, audio output is entirely Pico-side:

  • The Pico firmware runs a 44.1 kHz I2S output loop, synthesizing PSG samples in its own YM2149 emulation state.
  • The Pi does not use SDL audio to drive the speaker on the device build. If sound_init() opens an SDL device there at all, that path is not connected to the physical speaker.
  • The Pi daemon’s sole audio responsibility is sending register-write UART frames promptly.

sound_shutdown() closes the SDL audio device. It is called at emulator exit; no prototype equivalent is needed.


sfx.c / sfx.h provide a small library of named sound effects (SFX_TYPING, SFX_BOOT_HUM, etc.) implemented as sequences of psg_write() calls. The nosh_sfx_play(cue_id) wrapper in sound.c delegates to sfx_play(). Cues are baked C data — not Lisp-authored — because they need to fire in response to firmware events (keyclick, boot) before any cartridge is loaded.
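One way to picture cues as baked C data is a constant table of register writes per cue, replayed by a small loop. This is a sketch only: the struct names, the SKETCH_SFX_* ids, and the register values are hypothetical and are not the contents of sfx.c, and real cues presumably also carry timing between writes, which is omitted here.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t reg; uint8_t val; } SketchSfxWrite;
typedef struct { const SketchSfxWrite *writes; size_t count; } SketchSfxCue;

enum { SKETCH_SFX_BLIP = 0, SKETCH_SFX_MUTE, SKETCH_SFX_COUNT };

/* Hypothetical cue data: constant tables that need no cartridge to exist. */
static const SketchSfxWrite blip_writes[] = {
    { 0, 0x1C }, { 1, 0x01 },   /* channel A tone period */
    { 8, 0x0F },                /* channel A full volume */
    { 7, 0x3E },                /* mixer: tone A on, everything else off */
};
static const SketchSfxWrite mute_writes[] = { { 8, 0x00 }, { 7, 0x3F } };

static const SketchSfxCue cues[SKETCH_SFX_COUNT] = {
    [SKETCH_SFX_BLIP] = { blip_writes, sizeof blip_writes / sizeof blip_writes[0] },
    [SKETCH_SFX_MUTE] = { mute_writes, sizeof mute_writes / sizeof mute_writes[0] },
};

static uint8_t psg_regs[14];   /* stand-in for the PSG register state */

static void sketch_psg_write(uint8_t reg, uint8_t val) {
    if (reg < 14) psg_regs[reg] = val;
}

/* Replays a cue's writes in order; returns -1 for an unknown cue id. */
static int sketch_sfx_play(unsigned cue_id) {
    if (cue_id >= SKETCH_SFX_COUNT) return -1;
    for (size_t i = 0; i < cues[cue_id].count; i++)
        sketch_psg_write(cues[cue_id].writes[i].reg, cues[cue_id].writes[i].val);
    return 0;
}
```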

Cart-side Lisp can trigger SFX cues via the (sfx-play cue-id) FFI primitive (ADR-0005). Custom cartridge music is authored with direct psg-tone/psg-envelope calls rather than the cue system.