KN-86 Voice Bark Recording Guide
Architecture note (ADR-0017): The v1.0 guide described packing bark samples as 4-bit/8kHz data for playback through the YM2149’s Channel C amplitude register. That model is obsolete. ADR-0017 moved YM2149 synthesis to the Pico 2 coprocessor; the MAX98357A receives 16-bit PCM over I2S directly. The `kn86bark` tool now outputs 16-bit signed PCM at 22,050 Hz. Recording and delivery guidance below remains valid; only the encoding pipeline and storage format differ from v1.0. See `docs/software/runtime/pcm-voice-bark.md` section 2 for the full technical revision.
Overview
The kn86bark tool converts WAV files into 16-bit PCM data for playback through the Pico 2 coprocessor’s I2S audio path (MAX98357A DAC/amplifier). This guide covers how to record voice barks and sound effects that sound right at 22 kHz / 16-bit.
The lo-fi aesthetic is a choice, not a constraint — the Pico/I2S path can deliver clean speech. Lean into the gritty character deliberately.
Quick Start
    # Build the tool
    cd kn86-emulator/build && cmake .. && make kn86bark

    # Convert a WAV (outputs 22 kHz / 16-bit PCM)
    ./bin/kn86bark ../assets/barks/breach.wav breach.pcm --normalize --preview

This produces:

- `breach.pcm`: raw signed 16-bit little-endian PCM at 22,050 Hz
- A TOML snippet for inclusion in the cartridge’s `audio/barks/barks.toml` index
Part 1: Voice Barks
Equipment
- Microphone: Any USB condenser or dynamic mic. An SM58 is fine. Even a laptop mic works for prototyping.
- Software: Audacity (free), GarageBand, Logic, Ableton, or any DAW that exports WAV.
- Room: Quiet room, close-mic distance (4-6 inches). Don’t worry about room treatment — the character of the 28mm speaker through the Pelican case will dominate anyway.
Delivery Style
The Cipher voice is a “competent colleague”: terse, clipped, authoritative. Think military radio operator or air traffic controller.
Do:
- Bark the word. Short, punched, declarative.
- Use hard consonants: B, D, K, T, CH, P
- Keep it between 0.3 and 0.7 seconds. If you can’t say it in under a second, pick a shorter word.
- Speak with conviction. The voice should sound like it has authority.
Don’t:
- Don’t use a friendly or conversational tone.
- Don’t whisper — even at 16-bit you need amplitude to punch through a small speaker.
- Avoid sibilants and soft fricatives (S, SH, F, TH); they lose punch through a 28mm speaker. Hard consonants are still preferred.
- Avoid words that start with soft sounds. “SURFACE” works (hard S attack); “seven” doesn’t (soft S + low vowel).
Word Selection
Words that work best at 22 kHz through a 28mm speaker:
| Great | Okay | Avoid |
|---|---|---|
| BREACH | SURFACE | SHIFT |
| CLEAN | LAUNCH | FINISH |
| BURNED | EXIT | ASSESS |
| CONTACT | VOID | THESIS |
| DEPTH | MATCH | FRESH |
| TRACED | AUDIT | SOFT |
| BLOCKED | ROUTE | FEATHER |
| PATROL | FLAGGED | WHISPER |
The pattern: plosive consonants (B, D, K, T, P) at word boundaries, strong vowels in the middle, short duration.
Recording Settings
| Setting | Value |
|---|---|
| Sample rate | 44.1 kHz or 48 kHz |
| Bit depth | 16-bit or 24-bit |
| Channels | Mono (or stereo; kn86bark will mix down to mono) |
| Format | WAV (uncompressed) |
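Before spending time in the DAW, it can help to confirm that a take matches the settings above. The following is a minimal sketch using Python’s standard `wave` module. The function name `check_source_wav` is hypothetical, and the 0.3-0.7 second window is borrowed from the voice delivery guidance, so it is an assumption that you want to apply it to the trimmed file (which also includes the head and tail silence).

```python
import wave


def check_source_wav(path, min_s=0.3, max_s=0.7):
    """Return a list of problems found in a source WAV (empty list = looks fine).

    Hypothetical helper: the limits mirror the recording-settings table and
    the voice bark duration guidance, not anything kn86bark itself enforces.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        bits = wav.getsampwidth() * 8
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate

    if rate not in (44100, 48000):
        problems.append(f"sample rate {rate} Hz (expected 44100 or 48000)")
    if bits not in (16, 24):
        problems.append(f"bit depth {bits} (expected 16 or 24)")
    if channels not in (1, 2):
        problems.append(f"{channels} channels (expected mono or stereo)")
    if not min_s <= duration <= max_s:
        problems.append(f"duration {duration:.2f} s (expected {min_s}-{max_s} s)")
    return problems
```

Calling `check_source_wav("breach.wav")` on a compliant take should return an empty list. For sound effects, pass a wider window such as `max_s=1.0`.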
DAW Processing Chain
Apply these effects in order before exporting:
- Trim silence. Leave ~50ms of silence at head and tail. Remove breaths.
- Normalize to -1dB. Get the signal hot without clipping.
- High-pass filter at 200Hz. Removes room rumble, plosive pops, and low-frequency mud.
- Compress hard.
  - Ratio: 8:1 or higher (limiting)
  - Attack: fast (1-5ms), to catch the transient
  - Release: fast (50-100ms)
  - Threshold: set so you get 6-10dB of gain reduction
  - This squashes the dynamic range so the bark punches through the 28mm speaker’s limited SPL range.
- Optional: bitcrush / distortion. Apply a light bitcrush (8-bit or 6-bit) in the DAW if you want the lo-fi aesthetic. This is a deliberate aesthetic choice: `kn86bark` outputs 16-bit and does not add quantization noise.
- Export as WAV. 44.1kHz/16-bit mono.
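To get a feel for the numbers in the chain above, the arithmetic behind “normalize to -1dB” and “6-10dB of gain reduction” can be sketched as follows. This is only the static, hard-knee part of a compressor; it ignores attack and release, and the function names and the -10 dBFS threshold are illustrative assumptions, not settings from any particular DAW.

```python
def db_to_gain(db):
    """Convert a level in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)


def normalize_gain(peak, target_db=-1.0):
    """Linear gain that moves a peak (0..1, where 1 is full scale) to target_db dBFS."""
    return db_to_gain(target_db) / peak


def compress_db(level_db, threshold_db, ratio):
    """Static hard-knee compressor curve: output level (dB) for an input level (dB)."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


# Assumed example: a take peaking at -1 dBFS, threshold at -10 dBFS, 8:1 ratio.
in_db = -1.0
out_db = compress_db(in_db, threshold_db=-10.0, ratio=8.0)
print(f"gain reduction at the peak: {in_db - out_db:.1f} dB")
```

With these assumed values the peak is 9 dB over the threshold and comes out a little over 1 dB above it, which lands inside the 6-10dB reduction window. Lowering the threshold pushes the reduction higher.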
Conversion
    kn86bark voice.wav voice.pcm --normalize --preview

`--normalize` scales the peak to the full signed 16-bit range. Always use this for voice.

`--preview` plays back through your speakers so you can hear the result immediately.
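For readers who want to see what peak normalization to the signed 16-bit range means in practice, here is a minimal sketch. It is an illustration of the general technique, not the tool’s actual code: the function names are hypothetical, and whether `kn86bark` targets 32767 exactly or rounds the same way is an assumption.

```python
import struct

INT16_MAX = 32767


def normalize_to_int16(samples):
    """Scale float samples so the largest absolute value maps to +/-32767."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Pure silence: nothing to scale.
        return [0] * len(samples)
    scale = INT16_MAX / peak
    return [int(round(s * scale)) for s in samples]


def pack_pcm_le(int_samples):
    """Pack integers as raw signed 16-bit little-endian PCM bytes."""
    return struct.pack(f"<{len(int_samples)}h", *int_samples)
```

The output of `pack_pcm_le` has the same layout the Quick Start describes for `.pcm` files: two bytes per sample, low byte first, no header.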
Iterate
The feedback loop is:
- Record a take in Audacity
- Apply processing chain, export WAV
- Run `kn86bark` with `--preview`
- Listen. Is the word recognizable?
- If not: try a different delivery (more punch, harder consonants), adjust compression, or pick a different word
- Repeat
The success criterion from the spec: “recognizable as the word on first hearing, without accompanying text.”
Part 2: Sound Effects
Yes, This Works for SFX
The kn86bark pipeline converts any WAV to 16-bit/22kHz PCM. Voice barks are the primary use case, but the format works well for mechanical, electronic, and percussive sound effects.
What Works Well
Modem handshake / data transmission
- Record a real modem dialing sequence, or synthesize one (carrier tones at 1200/2400 Hz, FSK modulation). Audacity’s “Generate > Chirp” can approximate carrier negotiation.
- The 16-bit path is transparent and adds no grit of its own. If you want authentic digital grit, apply a bitcrush effect in your DAW before export rather than relying on quantization noise.
- Best duration: 0.5-1.0 seconds of the initial handshake screech. Use only the iconic opening burst, not the full negotiation.
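If you prefer to synthesize the modem tone rather than record one, the FSK idea mentioned above can be sketched in a few lines. This is a rough approximation under stated assumptions: the 300 baud rate, the mapping of bit 0 to 1200 Hz and bit 1 to 2400 Hz, and the function name `fsk_burst` are all illustrative choices, not taken from any real modem standard or from the KN-86 tooling.

```python
import math


def fsk_burst(bits, rate=22050, baud=300, f0=1200.0, f1=2400.0, amp=0.8):
    """Continuous-phase FSK: one tone per bit, with phase carried across bit edges.

    Returns float samples in the range -amp..+amp.
    """
    samples_per_bit = rate // baud
    phase = 0.0
    out = []
    for bit in bits:
        # Phase increment per sample for this bit's tone.
        step = 2 * math.pi * (f1 if bit else f0) / rate
        for _ in range(samples_per_bit):
            out.append(amp * math.sin(phase))
            phase += step
    return out


# Roughly half a second of alternating and repeated bits.
pattern = [1, 0, 1, 1, 0, 0, 1, 0] * 19
burst = fsk_burst(pattern)
print(f"{len(burst) / 22050:.2f} s of audio")
```

Carrying the phase across bit boundaries avoids a click at every transition. Write the result to a WAV and run it through the normal conversion step to hear how it survives.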
Hard drive seeking / knocking
- Option A: Put a contact microphone on an old spinning hard drive and record seek operations. Copy a large directory to trigger sustained seek activity.
- Option B: Synthesize it. Short noise bursts (10-30ms) with a resonant bandpass filter around 800-1200 Hz, repeated at irregular intervals. Audacity’s noise generator + envelope tool can do this.
- The clicking, mechanical character survives the 22 kHz path cleanly — these sounds are already bandwidth-limited and transient.
- Best duration: 0.3-0.5 seconds of seek chatter.
Relay clicks / switching
- Record a physical relay (toggle a power strip on/off near a mic) or use a short noise transient with fast envelope.
- These are already nearly single-sample events — they map cleanly to any sample rate.
- Duration: 0.1-0.2 seconds.
CRT degauss / power-on hum
- Record a CRT monitor being powered on (the “thunk” of the degauss coil).
- Or synthesize: 60Hz hum (or 50Hz for PAL regions) with a decaying envelope, mixed with a short low-frequency impulse.
- Duration: 0.3-0.5 seconds.
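The synthesized variant of the degauss sound can be approximated as a decaying mains-frequency sine plus a much shorter low-frequency thump. The sketch below is one way to do that; the 40 Hz thump frequency, the decay times, and the mix levels are guesses to tune by ear, and `degauss_hum` is a hypothetical name.

```python
import math


def degauss_hum(duration=0.4, rate=22050, mains_hz=60.0,
                hum_decay_s=0.1, thump_hz=40.0, thump_decay_s=0.01):
    """Decaying mains hum mixed with a short low-frequency impulse.

    Returns float samples; the two layers are mixed at 0.6 and 0.4 so the
    sum cannot exceed full scale.
    """
    n = round(duration * rate)
    out = []
    for i in range(n):
        t = i / rate
        hum = 0.6 * math.exp(-t / hum_decay_s) * math.sin(2 * math.pi * mains_hz * t)
        thump = 0.4 * math.exp(-t / thump_decay_s) * math.sin(2 * math.pi * thump_hz * t)
        out.append(hum + thump)
    return out
```

Pass `mains_hz=50.0` for the PAL-region flavor. Because the result leans on low frequencies, this is a case where skipping the 200Hz high-pass filter makes sense.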
Dot matrix printer
- Record or find a sample of a dot matrix head chattering across a page.
- The rhythmic, mechanical pattern translates well at 22 kHz.
- Duration: 0.5-1.0 seconds.
Typewriter / mechanical keyboard
- Record individual key strikes on a mechanical keyboard (Cherry MX Blue switches are great for this).
- Each strike is a sharp transient — ideal for PCM playback.
What Doesn’t Work
| Sound Type | Why It Fails |
|---|---|
| Music / melodies | Use the PSG’s tone generators for music — they’re the right tool. |
| Sustained pads / drones | Duration mismatch: barks are capped at 1.0 second by design. |
| Speech sentences | Too long, too much tonal variation. Barks are single words for a reason. |
| Delicate / quiet sounds | The 28mm speaker and the Pelican case will flatten subtle dynamics regardless of bit depth. |
| Sounds relying on very high-frequency content | A 22.05 kHz sample rate reproduces nothing above about 11 kHz (the Nyquist limit); most relevant content is well below this. |
SFX Processing Tips
- Skip the high-pass filter for SFX that have important low-frequency content (relay thunks, drive motor rumble). The 200Hz HPF is for voice clarity.
- Use bitcrush in your DAW for digital/electronic sounds if you want quantization grit as an aesthetic — don’t rely on the pipeline to add it at 16-bit.
- Use a gentle fade-in/out on organic/mechanical sounds (drive seeks, relay clicks) where you want smooth boundaries.
- Try different sample rates. `--rate 44100` gives maximum fidelity; `--rate 11025` gives something closer to the old 4-bit character (very lo-fi, potentially interesting for some SFX).
SFX Conversion Examples
Section titled “SFX Conversion Examples”# Modem handshakekn86bark modem.wav modem.pcm --normalize --preview
# Hard drive seekskn86bark hdd_seek.wav hdd_seek.pcm --normalize --preview
# Relay clickkn86bark relay.wav relay.pcm --normalize --preview
# Extra lo-fikn86bark modem.wav modem_lo.pcm --normalize --rate 11025 --previewPart 3: Using Converted Barks in Cartridges
Section titled “Part 3: Using Converted Barks in Cartridges”Add to barks.toml
Section titled “Add to barks.toml”[[bark]]label = "BREACH"file = "audio/barks/breach.pcm"rate = 22050vol = 1.0
[[bark]]label = "CLEAN"file = "audio/barks/clean.pcm"rate = 22050vol = 1.0
[[bark]]label = "MODEM"file = "audio/barks/modem.pcm"rate = 22050vol = 0.8Trigger from a Lisp Handler
Section titled “Trigger from a Lisp Handler”(on-event :node-compromised (fn (node) (bark-play "BREACH") (text-puts 0 12 "> NETWORK COMPROMISED")))SD Storage Budget (post-ADR-0019)
Bark files live as sidecar `.pcm` files on the cartridge SD filesystem alongside `cart.kn86`. There is no flash-region constraint (ADR-0019 replaced the on-cart flash model with USB-MSC SD storage).
| Content | Duration | Size at 22 kHz / 16-bit |
|---|---|---|
| Voice bark | 0.5 sec | ~22 KB |
| Modem screech | 1.0 sec | ~44 KB |
| HDD seek burst | 0.3 sec | ~13 KB |
| Relay click | 0.1 sec | ~4 KB |
| 8 barks typical | 0.5 sec avg | ~176 KB total |
At these sizes, storage budget is not a concern — any SD card has headroom to spare. See docs/software/runtime/pcm-voice-bark.md section 2D for the full filesystem layout.
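The sizes in the table follow directly from sample rate, bit depth, and duration. A small sketch of that arithmetic, assuming mono output and 1 KB = 1000 bytes (which is what the table’s rounding implies); `pcm_bytes` is a hypothetical helper name:

```python
def pcm_bytes(duration_s, rate=22050, bits=16, channels=1):
    """Size in bytes of raw PCM: seconds x samples/second x bytes/sample."""
    return round(duration_s * rate * channels * bits / 8)


for name, seconds in [("Voice bark", 0.5), ("Modem screech", 1.0),
                      ("HDD seek burst", 0.3), ("Relay click", 0.1)]:
    print(f"{name}: {pcm_bytes(seconds) / 1000:.1f} KB")
```

The same function shows the effect of the `--rate` option: halving the rate to 11025 halves the file size, and 44100 doubles it.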
Part 4: Troubleshooting
Word is unrecognizable:
- Try a harder delivery. More punch on the initial consonant.
- Pick a word with stronger consonants (BREACH instead of FREEZE).
- Check your compression — if the dynamics aren’t squashed, the quiet parts get lost under the speaker’s background noise floor.
- Try `--rate 44100` for the full sample rate if 22 kHz sounds insufficient.
Too much noise / hiss:
- Your source recording has a low signal-to-noise ratio. Re-record closer to the mic.
- Apply noise reduction in your DAW before export.
- Make sure `--normalize` is on.
Clicks or pops at boundaries:
- Ensure ~50ms silence at head/tail of the WAV. Hard cuts at non-zero crossings cause clicks.
- Apply a 5ms fade-in and fade-out in your DAW.
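If you would rather fix boundary clicks in a script than in the DAW, a 5ms fade amounts to ramping the first and last samples toward zero. The sketch below uses a linear ramp on a list of 16-bit integer samples; the function name is hypothetical, and a DAW’s fade curve may be shaped differently.

```python
def apply_fades(samples, rate=22050, fade_ms=5.0):
    """Return a copy of samples with a linear fade-in and fade-out applied."""
    # Never let the two fades overlap on very short clips.
    n = min(int(rate * fade_ms / 1000), len(samples) // 2)
    out = list(samples)
    for i in range(n):
        gain = i / n  # 0.0 at the very edge, approaching 1.0 inward
        out[i] = int(out[i] * gain)
        out[-1 - i] = int(out[-1 - i] * gain)
    return out
```

At 22,050 Hz a 5ms fade covers about 110 samples at each end, so the first and last samples are forced to zero and the click from a non-zero start or end disappears.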
Preview doesn’t play / crashes:
- SDL3 audio initialization issue. Check `SDL_GetError()` output.
- Some audio backends don’t support 22kHz natively. SDL3 should resample internally, but try `--rate 44100` if 22kHz fails.