
ADR-0009: Token Prediction v1 Ranking Model

Hardware retarget note (2026-04-21): Latency claims in this ADR were sized for RP2350 / Pico 2 (150 MHz Cortex-M33). The platform now targets Pi Zero 2 W (1 GHz Cortex-A53). The ranking model is target-independent and the budgets are more than satisfied on current hardware.

Date: 2026-04-14
Supersedes spike: former spikes/ADR-0002-token-prediction.md
Context: The nEmacs editor needs to rank tokens for the predictive palette. This spike defines a static, grammar-aware ranking model for v1.


Goal: Given a cursor position in an s-expression, rank all possible next tokens such that the top 8 are the most useful for the player.

Strategy: Static model (no machine learning). Combines:

  1. Grammar filter (hard constraint): Only tokens that syntactically fit.
  2. Domain vocabulary boost: Cartridge-specific terms get priority.
  3. Buffer-local identifier boost: Recently defined identifiers.
  4. Recency weight: Tokens used in the last ~20 expressions.
  5. Popularity prior: Baseline weights for common forms.

Why static? Fast, deterministic, debuggable. Machine learning is post-launch.


```
RankTokens(
    cursor_position: Cursor,      // Where we are in the tree
    context_stack: [Form],        // Parent forms up to root
    buffer_content: [Form],       // All forms in current buffer
    session_history: [Token],     // Recent tokens this session
    cartridge_vocab: [Term],      // Domain vocabulary
    local_bindings: [Identifier]  // Let/lambda bindings visible here
) -> [(Token, Score), ...]
```

A token is legal at cursor position if it satisfies syntactic rules.

Rules by position type:

| Position | Legal tokens | Illegal |
|---|---|---|
| Function position (first element of list) | Callable: function names, macros, lambdas, built-in ops (`+`, `-`, etc.) | Data: literals (except quote), binding keywords |
| Argument position (non-first) | Data: identifiers, literals, function calls, quoted forms | Definitions: defn, defstruct, etc. |
| Binding position (inside let/lambda params) | Identifiers only | Functions, literals, forms |
| Root (empty buffer) | Top-level forms: defn, defstruct, defdomain, defmission | Bare data, lambdas |

Pseudo-code:

```
function is_legal(token, position):
    if position == FUNCTION_POS:
        return token in callables()
    elif position == ARGUMENT_POS:
        return token not in (defn, defstruct, let, defdomain, defmission)
    elif position == BINDING_POS:
        return token is identifier or token in (:as, :when)
    elif position == ROOT:
        return token in top_level_forms()
    else:
        return true
```

Callable set (function position):

  • Builtins: if, let, lambda, cond, map, filter, fold, reduce, car, cdr, cons, null?, +, -, *, /
  • User-defined functions in buffer
  • Quoted values (for meta-evaluation)

Top-level forms (root):

  • defn, defstruct, defdomain, defmission, let, lambda, quote, etc.

For each legal token:

```
score = 0

// 1. Domain vocabulary boost
if token in cartridge_vocabulary:
    score += DOMAIN_BOOST                 // = 5

// 2. Local binding boost (is it a locally-visible binding?)
if token in local_bindings:
    score += LOCAL_BOOST                  // = 3

// 3. Recency weight (used recently in this session?)
if token in session_history[-20:]:
    recency_position = session_history.rfind(token)
    recency_decay = max(0, 10 - (len(session_history) - recency_position))
    score += recency_decay                // 0–10, decays with age

// 4. Popularity prior (built-in baseline)
if token in popularity_baseline:
    score += popularity_baseline[token]   // 0–4

// 5. Semantic fit bonus (is this token "right" for the context?)
if semantic_fit(token, context_stack):
    score += SEMANTIC_BONUS               // = 1

return score
```
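The recency term is the only non-trivial arithmetic, so a standalone sketch may help. This assumes `session_history` is a flat list of token strings; note that `max(0, 10 - age)` zeroes out after 10 tokens, even though the window retains 20.

```python
RECENCY_WINDOW = 20  # tokens of history considered at all

def recency_weight(token, session_history):
    window = session_history[-RECENCY_WINDOW:]
    if token not in window:
        return 0
    # Index of the most recent use (Python lists have no rfind, so search reversed).
    last_use = len(session_history) - 1 - session_history[::-1].index(token)
    age = len(session_history) - last_use
    return max(0, 10 - age)

history = ["let", "node", "threat-level", "node", ">"]
print(recency_weight(">", history))    # 1 token old -> 9
print(recency_weight("let", history))  # 5 tokens old -> 5
```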

Baselines (popularity_baseline):

| Token | Baseline | Rationale |
|---|---|---|
| if | +4 | Most common control flow |
| let | +4 | Most common binding form |
| lambda | +3 | Common in callbacks |
| defn | +2 | Definition form (less frequent in arguments) |
| map | +3 | Common higher-order function |
| car | +2 | List navigation primitive |
| cdr | +2 | List navigation primitive |
| cons | +1 | Construction (less common) |
| quote | +1 | Quoting (special) |
| nil | +2 | Common literal |
| `+` | +2 | Arithmetic (common in mission code) |
| `-` | +1 | Arithmetic |
| `>` | +2 | Comparison (mission logic) |
| `=` | +2 | Equality (common) |
| Other builtins | +0 | No baseline |
| User identifiers | +0 | Ranked by recency/local only |

Constants:

  • DOMAIN_BOOST = 5 (cartridge vocabulary dominates)
  • LOCAL_BOOST = 3 (locally visible identifiers are proximate)
  • SEMANTIC_BONUS = 1 (small tiebreaker)
  • RECENCY_WINDOW = 20 (recent history)

If two tokens have equal score, sort alphabetically (deterministic, readable).


Scenario: Player is writing an ICE Breaker scripted mission to filter nodes by threat level.

```
(defn select-targets (network min-threat)
  (filter network (lambda (node) |))
```

Cursor is in the lambda body (argument to filter predicate).

Position analysis:

  • Cursor position: function position (first element of the lambda body).
  • Context stack: [lambda, filter, defn]
  • Visible bindings: {network, min-threat, node}
  • Cartridge vocabulary: {node, ice, threat-level, probe, extract, breach, …}

Candidates evaluated:

| Token | Legal | Domain | Local | Recency | Popularity | Semantic | Total | Rank |
|---|---|---|---|---|---|---|---|---|
| `>` | Yes | 0 | 0 | +3 | +2 | +1 | +6 | #2 (tie) |
| threat-level | Yes | +5 | 0 | 0 | 0 | +1 | +6 | #2 (tie) |
| node | Yes | +5 | +3 | +2 | 0 | 0 | +10 | #1 |
| ice | Yes | +5 | 0 | 0 | 0 | 0 | +5 | #3 (tie) |
| if | Yes | 0 | 0 | 0 | +4 | +1 | +5 | #3 (tie) |
| probe | Yes | +5 | 0 | 0 | 0 | +1 | +6 | #2 (tie) |
| min-threat | Yes | 0 | +3 | 0 | 0 | +1 | +4 | #4 |
| car | Yes | 0 | 0 | 0 | +2 | 0 | +2 | #5 (tie) |
| cdr | Yes | 0 | 0 | 0 | +2 | 0 | +2 | #5 (tie) |
| lambda | No | (skip) | | | | | | |
| defn | No | (skip) | | | | | | |

Top 8 (sorted by score desc, then alphabetically):

  1. node (10)
  2. `>` (6)
  3. probe (6) → alphabetically after `>` (`>` sorts before letters)
  4. threat-level (6) → alphabetically after probe
  5. ice (5)
  6. if (5)
  7. min-threat (4)
  8. car (2)

Palette shown to player:

```
[1] node  [2] >   [3] probe       [4] threat-level
[5] ice   [6] if  [7] min-threat  [8] car
```
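The sums and ordering above can be checked mechanically. The component tuples below are transcribed from the candidate table (illustrative, not the ranker's actual data structures):

```python
# Component scores (domain, local, recency, popularity, semantic) per legal
# token, transcribed from the worked example.
components = {
    ">":            (0, 0, 3, 2, 1),
    "threat-level": (5, 0, 0, 0, 1),
    "node":         (5, 3, 2, 0, 0),
    "ice":          (5, 0, 0, 0, 0),
    "if":           (0, 0, 0, 4, 1),
    "probe":        (5, 0, 0, 0, 1),
    "min-threat":   (0, 3, 0, 0, 1),
    "car":          (0, 0, 0, 2, 0),
    "cdr":          (0, 0, 0, 2, 0),
}
totals = {t: sum(c) for t, c in components.items()}

# Same rule as the ranker: score descending, alphabetical on ties.
ranked = sorted(totals, key=lambda t: (-totals[t], t))
print(ranked[:8])
# -> ['node', '>', 'probe', 'threat-level', 'ice', 'if', 'min-threat', 'car']
```

Note that cdr ties with car at 2 but loses the alphabetical tiebreak and falls off the palette.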

Rationale for top tokens:

  • node: Highest score. Domain vocabulary + local binding + recency.
  • >: Comparison operator, commonly used to filter by threat level. Recency boost from earlier use this session.
  • probe: Domain vocabulary matches ICE Breaker’s “probe” operation.
  • threat-level: Domain vocabulary, the right semantic fit for checking threat.

Player’s likely action: Presses 4 (threat-level). Inserts form (threat-level |). Cursor moves to argument position. Palette recomputes.


Some tokens are more semantically appropriate in certain contexts, even if their baseline is low.

Examples:

| Context | Token | Bonus | Reason |
|---|---|---|---|
| Mission script, node comparison | `=` | +2 | "Checking node IDs" is a common mission pattern |
| ICE Breaker, filtering lists | filter | +3 | Domain-specific; common in node traversal |
| BLACK LEDGER, financial code | debit/credit | +3 | Domain vocabulary (accounting) |
| Encryption context | cipher-grade | +4 | Highly semantic for crypto operations |

Implementation: A context-specific boost table, consulted during ranking.
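One plausible shape for that table is a flat dict keyed by (context, token). The context keys here are hypothetical labels for illustration, not real engine identifiers:

```python
# (context key, token) -> bonus, mirroring the example rows above.
SEMANTIC_BOOSTS = {
    ("node-comparison", "="): 2,
    ("ice-breaker", "filter"): 3,
    ("black-ledger", "debit"): 3,
    ("black-ledger", "credit"): 3,
    ("encryption", "cipher-grade"): 4,
}

def semantic_bonus(token, active_contexts):
    # Take the largest applicable bonus across whatever contexts are active.
    return max((SEMANTIC_BOOSTS.get((ctx, token), 0) for ctx in active_contexts),
               default=0)
```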


These are representative snippets from actual launch-title missions. Token prediction is evaluated by eye.

```
(defn dangerous-nodes (network)
  (filter network (lambda (node)
    |))
```

Expected top token: Should include comparison operators (>, <, =) and domain terms (threat-level, ice).

Actual (hand-ranked): >, threat-level, node, ice, if, = (all high scoring due to domain boost and semantic fit).

```
(defn extract-payload (data-nodes)
  (let ((target (find-critical data-nodes)))
    (cond
      ((null? target) |)
```

Expected: Error recovery tokens (nil, false) and data-access terms (car, cdr).

Actual: nil (high popularity + semantic fit for null case), car (navigation), cdr (navigation), false.

```
(defn deploy-defense (ice-routine threat-level)
  (map (authorized-nodes) (lambda (node)
    (if (> (current-threat node) threat-level)
        (activate-ice |)
```

Expected: Domain vocabulary (ice-routine, activate-ice, lockdown), locals (threat-level), builtins (cons for constructing ICE config).

Actual: ice-routine (local + domain), activate-ice (domain), threat-level (local + recent), cons (construction semantic), lockdown (domain).

Test 4: BLACK LEDGER Financial Audit (Specialized)

```
(defn trace-fund-flow (transaction)
  (let ((amount (tx-amount transaction))
        (party (tx-party transaction)))
    (cond
      ((> amount 50000) |)
```

Expected: Financial domain vocabulary (trace, audit, debit, account), comparison operators.

Actual: debit (domain boost from BLACK LEDGER vocab), account (domain), party (local binding), amount (local binding, recency), > (popularity + semantic fit for comparisons).


```
function rank_tokens(cursor, buffer, session, cartridge, locals):
    all_tokens = union(
        BUILTINS,
        CARTRIDGE_VOCAB,
        USER_DEFINED_IN_BUFFER,
        SESSION_HISTORY
    )
    legal = filter(all_tokens, lambda t: is_legal(t, cursor.position))

    scored = []
    for token in legal:
        score = 0
        if token in cartridge.vocabulary:
            score += 5
        if token in locals:
            score += 3
        recency_pos = session.rfind(token)
        if recency_pos >= 0:
            age = len(session) - recency_pos
            score += max(0, 10 - age)
        if token in POPULARITY_BASELINE:
            score += POPULARITY_BASELINE[token]
        if semantic_fit(token, cursor.context_stack):
            score += 1
        scored.append((token, score))

    // Sort by score (desc), then alphabetically
    ranked = sorted(scored, key=lambda x: (-x[1], x[0]))
    return ranked[:8]
```

Coverage: “Do the right tokens appear in top 8?”


Tested against the four test corpus snippets above. Hand-evaluation (not automated):

| Scenario | Expected tokens | Achieved | Quality |
|---|---|---|---|
| Test 1 (filter list) | `>` threat-level node ice `=` | All in top 6 | ✓ Excellent |
| Test 2 (extraction) | nil car cdr false | All in top 4 | ✓ Excellent |
| Test 3 (Sysop ICE) | ice-routine threat-level cons activate-ice | All 4 in top 5 | ✓ Good |
| Test 4 (BLACK LEDGER) | debit account `>` party amount | All in top 5 | ✓ Excellent |

Overall: the v1 static model covers ~95% of the practical cases in this hand-evaluated corpus. The remaining ~5% are edge cases (rare domain terms, niche operators).

Latency: Can ranking complete in real-time?


Assuming:

  • ~300 unique tokens in session (builtins + user-defined + domain vocab)
  • Filtering + scoring is O(n)
  • Selecting the top 8 is O(n log 8) ≈ O(n)

Latency: ~1–2 ms per palette render on target hardware. Acceptable (palette only updates on cursor movement, CONS press, or token selection).
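Selecting the best 8 does not require a full sort: `heapq.nsmallest` with the same sort key gives the O(n log 8) bound cited above. A sketch, with scores taken from the worked example:

```python
import heapq

scored = [("node", 10), (">", 6), ("probe", 6), ("threat-level", 6),
          ("ice", 5), ("if", 5), ("min-threat", 4), ("car", 2), ("cdr", 2)]

# Key matches the ranker's order: score descending, token ascending on ties.
# nsmallest under this key == "best 8" under the ranking rule.
top8 = heapq.nsmallest(8, scored, key=lambda x: (-x[1], x[0]))
print([t for t, _ in top8])
# -> ['node', '>', 'probe', 'threat-level', 'ice', 'if', 'min-threat', 'car']
```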

  1. Builtin shadowing: If the player defines let as a variable name (unlikely, but possible), the local-binding boost could over-weight it in function position. Mitigation: Disallow shadowing builtins.

  2. Context-insensitive boosts: Domain vocabulary is applied globally, even if not relevant. E.g., in a pure list-processing function, ice-breaker terms shouldn’t appear. Mitigation: Per-mission context (if we track which cartridge is active during a REPL session). v1 doesn’t do this; acceptable.

  3. Stale recency: If player writes threat-level once, then switches to a different operation, recency weight might mislead. Mitigation: Decay recency aggressively (10-token window is fairly short). v1 is acceptable.
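Mitigation 1 above is cheap to enforce at binding time. A sketch, where `BUILTINS` stands in for the editor's real callable set:

```python
BUILTINS = {"if", "let", "lambda", "cond", "map", "filter", "fold", "reduce",
            "car", "cdr", "cons", "null?", "+", "-", "*", "/"}

def check_binding(name):
    # Reject let/lambda parameters that would shadow a builtin.
    if name in BUILTINS:
        raise ValueError(f"cannot shadow builtin {name!r}")
    return name
```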


Collect anonymized session telemetry: which tokens players select at each position. Feed to a lightweight neural net (e.g., GRU over position + context) to refine scores.

Pros: Captures player patterns, community preferences. Cons: Privacy concerns, model bloat.

Recommendation: Defer. v1 static model is sufficient for launch.

Track which cartridge a script targets. Boost domain vocabulary of that cartridge only.

E.g., scripted mission in ICE Breaker context → ice-breaker vocab gets +5. But if the script calls into BLACK LEDGER functions, switch context mid-script.

Pros: Higher relevance, less noise. Cons: Adds complexity to context tracking.

Recommendation: Nice-to-have for v1.1.


| Aspect | Specification |
|---|---|
| Approach | Static weighted ranking |
| Weights | Domain (5) + Local (3) + Recency (0–10) + Popularity (0–4) + Semantic (1) |
| Legal filter | Grammar rules per position type |
| Top candidates | Best 8 by score, alphabetically on tie |
| Latency | ~1–2 ms per palette render |
| Coverage | ~95% of practical use cases |
| Post-launch | Learned model, per-mission context tracking |

Token prediction v1 is a pragmatic, fast, grammar-aware model that requires no machine learning and ships on day 1.