# Inside the Agent `moo-agent` is the standalone CLI that signs into a DjangoMOO server as a persistent player and acts on its own. The user-facing story is in the [how-to guide](../how-to/moo-agent.md); the [first-agent tutorial](../tutorials/first-agent.md) walks through a starter run. This document is the explanation layer — *why* the agent is shaped the way it is. Most of the load-bearing detail lives here so that the modules in `moo/agent/` can stay short on inline commentary and just point back to the relevant section. ## Architecture at a Glance ``` ┌────────────────────────────────────────────────────────────────┐ │ cli.py — wires everything, owns the SIGTERM/reconnect loop │ └──────────────┬───────────────────────────────────────┬─────────┘ │ │ ▼ ▼ ┌──────────────────────────────┐ ┌────────────────────────┐ │ connection.py │ │ tui.py │ │ ├─ MooConnection (asyncssh)│ │ prompt-toolkit, two- │ │ ├─ MooSession (PREFIX/ │ │ pane scrollback + │ │ │ SUFFIX delimiter mode) │ │ live input field │ │ └─ iac.py (telnet IAC) │ └────────────┬───────────┘ └──────────────┬───────────────┘ │ │ on_output(text) │ operator input ▼ ▼ ┌────────────────────────────────────────────────────────────────┐ │ brain/__init__.py — Brain │ │ ├─ output_queue (asyncio.Queue) │ │ ├─ window (collections.deque, rolling output) │ │ ├─ script_queue (list[str], queued MOO commands) │ │ ├─ state (BrainState — current goal/plan/done flags) │ │ ├─ run() — perception-action loop │ │ ├─ _llm_cycle() — one inference + dispatch │ │ ├─ _wakeup_loop() — idle timer (timer-based agents) │ │ └─ _stall_check_loop() — token-chain stall recovery │ └──────────┬───────────────────────────┬─────────────────────────┘ │ │ ▼ ▼ ┌────────────────────┐ ┌────────────────────────────┐ │ brain/chain.py │ │ llm_client.py │ │ server-text │ │ provider selection, │ │ classifier; │ │ text scrub, LM Studio │ │ token chain │ │ text-fallback parsing │ │ relay & reconnect│ └────────────┬───────────────┘ └────────────────────┘ │ ▼ ┌──────────────────────────────────────────────┐ │ tools.py — ToolSpec / BUILDER_TOOLS │ │ typed tool harness; native or text mode │ └──────────────────────────────────────────────┘ ┌──────────────────────────────────────────────┐ │ soul.py — SOUL.md, SOUL.patch.md, baseline │ └──────────────────────────────────────────────┘ ┌──────────────────────────────────────────────┐ │ brain/plans.py — build & traversal plan I/O │ └──────────────────────────────────────────────┘ ``` `Brain` never imports from `moo.core` and never triggers Django setup. It only talks to the server through the `send_command` callback, and it only learns about the world through `enqueue_output(text)`. That keeps the agent a thin client of the MUD it inhabits, and lets the test suite drive `Brain` against captured fixtures. (perception-action-loop)= ## The Perception-Action Loop `Brain.run()` is one coroutine that drains the output queue, decides what (if anything) to do, and either fires the LLM or advances the script queue. The state machine has only a few moving parts but they interact in awkward ways because Celery, Kombu, and the SSH channel all deliver output on different schedules. ### The `output_queue → window` flow `enqueue_output()` is the single entry point for server text. It updates `_last_activity` (used by the wakeup timer) and pushes the line onto an `asyncio.Queue`. The run loop drains that queue with a 0.3-second timeout: - **Got a line:** append to `window`, classify it through `process_server_text` (chain relay, plan extraction, [Mail] suppression), then either dispatch a matching reflexive rule, advance the script queue, or arm `pending_llm`. - **Timed out (0.3 s of quiet):** flush `pending_drain`, `pending_llm`, or the fallback drain. This is the *quiet-period* edge that makes the rest of the loop work. ### Why drain after a quiet period A single MOO command can produce a burst of output — a `tell()` block, plus Celery `print()` preamble lines that arrive after the PREFIX/SUFFIX window of the *next* command. If the script queue advanced on every individual line, each preamble line would consume a script step and the agent would race through its plan in milliseconds. The fix is to set `pending_drain = True` whenever output arrives while the script queue is non-empty, and only call `_drain_script()` after 0.3 s of silence. By then the full burst has settled, and exactly one queued command fires per response cycle. Errors short-circuit this: if a server line matches `looks_like_error()`, the script queue is cleared immediately and control returns to the LLM. ### The fallback drain Some Celery-based verbs (`@create`, `@obvious`, `@alias`) emit their `print()` output *after* the PREFIX/SUFFIX window, so it never reaches `run()` at all. Without a fallback path, only the first command of a multi-step script executes; the rest wait until the wakeup timer fires a fresh LLM cycle and discards the queue. The fallback branch in `run()` checks for a queued script on every quiet tick and drains one step even when no output arrived. After the queue empties, an LLM cycle is queued so the agent can react to the result — unless the agent is an orchestrator or `timer_only`, in which case the cycle is suppressed. ### Pending-LLM gating When server output arrives and no rule matches, `pending_llm = True` arms an LLM cycle for the next quiet tick. Several conditions suppress that arming: - **Page-triggered, no goal yet** — agents with `idle_wakeup_seconds == 0` and no `current_goal` ignore non-page output and stay in `WAITING` until a page lands. See {ref}`wakeup-modes`. - **Orchestrator** — has no autonomous work; the token-chain relay in `chain.py` drives all of its commands deterministically. - **`timer_only`** — fires only via the wakeup timer; output is recorded but never triggers inference. - **`session_done`** — `done()` was called; status flips back to `READY` so the wakeup timer can still fire, but no LLM cycle runs until a fresh token page resets state. (script-queue)= ## The Script Queue `SCRIPT: a | b | c` directives, multi-step tool calls, and chain-relay commands all funnel into `_script_queue`. The queue is just a `list[str]` of raw MOO commands. `_drain_script()` pops one, writes it to the rolling window prefixed with `>`, and sends it. Loop detection (`_check_command_loop`) records the last 8 commands and injects an operator warning into the rolling window if any single command repeats 3+ times. ### Tool calls override text-mode scripts Some models (notably Gemma 4) emit *both* a structured tool call and a `SCRIPT:` line in the same response, which would execute the same command twice if the two queues were merged. `_dispatch_tool_calls` resolves this by *replacing* the SCRIPT-derived queue when any tool call translates to commands — native tool calls are authoritative. ### Done and `foreman_paged` guard `done()` is special: it has no MOO command output but it sets `session_done = True`, which suspends all further LLM cycles until a fresh token page resets state. Calling `done()` before the agent has paged Foreman with a "Token: …​ done." message would silently break the chain — Foreman would never receive the handoff and the chain would stall. The guard in `_dispatch_tool_calls` blocks `done()` until `foreman_paged` flips to True, and rewrites the agent's `current_goal` to a CRITICAL instruction telling it to send the page first. The bare-line fallback path applies the same guard; both paths read `foreman_paged` from `BrainState`. (llm-cycle)= ## One LLM Cycle `_llm_cycle()` is gated by a `Semaphore(1)` so rapid output never queues multiple in-flight calls — if a cycle is already running, the new one is silently skipped. The cycle: 1. **Build the system prompt** via `brain/prompt.py:build_system_prompt`. When the agent has tools wired up, the tool-mode preamble is used and the tool schemas carry the action vocabulary; otherwise the full text-mode directive grammar is emitted. 2. **Build the user message** via `brain/prompt.py:build_user_message` from `memory_summary`, `current_goal`, `current_plan`, the idle-wakeup counter, and the rolling window. 3. **Call the LLM** via `llm_client.call_llm` with up to 3 retries on 529 overload (5 s, 10 s, 20 s backoff). 4. **Parse the response** via `brain/directives.parse_llm_response` into an ordered list of `Directive` objects plus leftover thought lines. 5. **Apply directives** in source order. `GOAL:` updates `current_goal`, `PLAN:` rewrites the traversal plan, `SOUL_PATCH_*` appends to `SOUL.patch.md`, `BUILD_PLAN:` writes a YAML file under `builds/`, `SCRIPT:` populates the script queue, `DONE:` clears the goal, and `COMMAND:` is a one-shot dispatch. 6. **Dispatch tool calls** — dedupe consecutive duplicates (Gemma 4 sometimes emits the same call list twice), translate each through its `ToolSpec.translate`, and queue the results. See {ref}`tool-harness`. ### The bare-line fallback When neither a `COMMAND:` nor a `SCRIPT:` directive nor any tool calls were emitted, but a `current_goal` is set, `_try_bare_line_fallback` rescues a single-line response that *looks like* a MOO command. The heuristic is deliberately tight to avoid sending English prose to the server's parser: - The response must be exactly one non-empty line. - The line must not be a bare directive keyword (`GOAL`, `PLAN`, `DONE`, …) or a parenthetical narration (`(Wait mode)`). - The line either starts with a known MOO prefix (`@`, `say`, `page`, `look`, a compass direction) or is a short lowercase phrase (≤ 4 words, starting lowercase). Uppercase-first text is treated as English prose and discarded — `"Awaiting mason done page."` should never reach the server. - If the line parses as a tool call against the registered tool set, it is translated and queued through the tool harness. If even the fallback fails, an extra LLM cycle is queued (capped at 3 via `goal_only_count`) so models that split goal-setting and action across responses still get a chance to act. Orchestrators skip this — they have nothing to "act on" while waiting for a token holder. ### The goal-only re-cycle counter Some models (Gemma in particular) reliably emit a `GOAL:` line, then stop without an action. The counter trips one extra cycle each time a goal is set but no command is dispatched, capped at 3, so we don't enter an infinite ping-pong if the model is stuck. (wakeup-modes)= ## Wakeup Modes Agents fall into one of three operating modes, determined by config flags. ### Timer-based (`idle_wakeup_seconds > 0`) A background `_wakeup_loop` task fires an LLM cycle when the agent has been idle for `idle_wakeup_seconds`. Within 10 seconds of firing, the prompt flips to `SLEEPING` so the TUI can show countdown pressure. When the timer fires, the agent's `current_goal` is cleared (timer agents shouldn't loop on stale done/recap state), and optionally the rolling window is cleared as well. Reactive NPCs that need accumulated room context between wakeups can set `clear_window_on_wakeup = false`. The timer skips if the plan is fully exhausted *and* the agent has no current goal — at that point it has nothing left to do and would just invent extra work. ### Page-triggered (`idle_wakeup_seconds == 0`) Workers in the token chain (Mason, Tinker, Joiner, Harbinger) wait for a page from Foreman that hands them the token. They don't run a wakeup loop at all. The status flip in `_set_status` translates `READY` to `WAITING` so the prompt shows `waiting>` while idle. LLM cycles are suppressed unless the agent has a `current_goal` (token received, work in progress) or an incoming line is a page. This prevents the agent from burning tokens reasoning about server output that has nothing to do with its job. The {ref}`token-chain` mechanics arrange for the goal to be set automatically when a token page arrives. ### `timer_only` Set on Foreman. The wakeup timer is the *only* path that fires LLM cycles — output never arms `pending_llm`. This stops Foreman from over-reacting to incoming chain pages between its scheduled cycles. (stall-detection)= ## Stall Detection `_stall_check_loop` is a deterministic recovery path that bypasses the LLM entirely. It runs on Foreman (anywhere `stall_timeout_seconds > 0`) and re-pages the agent currently holding the token if it hasn't emitted a "done" page within the timeout. Before re-paging, the loop shells out to `agentmux cycle-age` (configured via `MOO_TOKEN_CHAIN_GROUP` and `MOO_AGENTMUX_PATH`) to ask whether the target agent is still inside a plausible LLM cycle. If the agent's elapsed time since its last log write is under `max(stall_s, 3 × p95)`, the re-page is suppressed — the agent is just slow, not deadlocked. This prevents Foreman from spamming an agent that's mid-inference on a slow local model. After firing, the dispatched timestamp resets so the next re-page fires one full timeout later (linear backoff, not exponential). (token-chain)= ## Token Chain Mechanics `brain/chain.py:process_server_text` is a pure function that runs on every inbound line. It classifies the line, mutates `BrainState` in place, and returns a `ChainActions` value telling Brain which scripts to queue and which thoughts to surface. Splitting it out of `Brain.run()` is what makes the relay logic testable against captured fixtures (see `tests/test_brain_chain.py`). ### Roles: orchestrator vs worker `is_orchestrator = bool(token_chain) and ssh.user not in token_chain` — an agent is the orchestrator when a chain is configured but the agent itself isn't a member. Workers (chain members) inherit `MOO_TOKEN_CHAIN` from the environment but must *not* relay; doing so would create an infinite self-page loop. ### Auto-start on connect When `text == "Connected"` and the orchestrator has no dispatched token yet, it pages the first agent in the chain with "Token: Foreman start." and records `token_dispatched_to`. No LLM call needed. ### Auto-relay When an incoming page contains "Token: …​ done.", the orchestrator looks up the sender's position in the chain and pages the next member (wrapping back to the first if the sender was last). Workers see the same line but skip relay because they're inside the chain. ### Auto-reconnect When a worker logs in mid-pass, it sends `Token: reconnected.` to Foreman. Foreman re-pages that agent — but only if no token is currently dispatched, or the dispatched target matches. This stops a batch startup from flooding Foreman with reconnect pages that each get a token handed back simultaneously. Workers themselves use `prior_goal_for_reconnect` to fire the reconnect page on their own connect event without waiting for an LLM cycle. ### Mailbox suppression `[Mail] From : ` lines are extracted, recorded into `memory_summary` as prior-session context, and *suppressed* from the rolling window. The line itself never reaches the LLM; only the parsed context does. This keeps the noise from `check_inbox` polling out of the prompt. ### Auto-extracted plans `divine()` returns a "Impressions surface…" header followed by indented ` (#NNN)` lines. Workers that need a traversal plan would otherwise have to format a `PLAN:` directive themselves; smaller models (Gemma) reliably stall on that step, setting a meta-goal like "prepare a plan" instead of emitting the directive. `process_server_text` extracts the room IDs directly into `current_plan`, so the agent can skip that step and go straight to teleporting to the first room. The extraction only fires when the plan is empty or was loaded from disk — it never overwrites an active plan from a token page or a fresh `BUILD_PLAN:`. (plan-persistence)= ## Plan Persistence `brain/plans.py` owns four free functions for plan I/O. Splitting them out of `Brain` lets the persistence logic be tested against a plain `BrainState` and a `tmp_path` directory. ### Build plans (Mason) `save_build_plan` accepts a `BUILD_PLAN:` payload, writes it as a datestamped YAML file to `builds/YYYY-MM-DD-HH-MM.yaml`, extracts top-level room names via the indent-aware regex in `directives.py`, and overrides `memory_summary` so the next LLM cycle starts *building* instead of re-planning. Only the first `BUILD_PLAN:` per session is accepted. If the plan is already populated (from a prior plan or a disk reload), subsequent `BUILD_PLAN:` directives are logged and ignored. The check has one exception: a plan made of only room IDs (`#128`, `#9`, …) is treated as visit-list context from a token page, and a real plan with room *names* is allowed to override it. ### Traversal plans (workers) `save_traversal_plan` writes `current_plan` to `builds/traversal_plan.txt` on every change. `load_traversal_plan` restores it on startup. Workers that don't emit `BUILD_PLAN:` (Tinker, Joiner, Harbinger) use this to resume their room list after a restart. `load_latest_build_plan` runs first; the traversal plan only loads if no build plan was found. Page-triggered agents always start cold and receive fresh room lists via the token, so the traversal plan is *not* loaded at construction time for them — a stale plan from a previous mission would let the LLM skip `divine()` on the next token pass and visit the wrong rooms. (soul-system)= ## The Soul System `soul.py` parses an agent's persona and operational rules from two files: - `SOUL.md` — the immutable core. Mission, persona, optional context, reflexive `Rules of Engagement`, `Verb Mapping` intent shorthands, and the `Tools` list. - `SOUL.patch.md` — append-only and agent-writable. The LLM emits `SOUL_PATCH_RULE:`, `SOUL_PATCH_VERB:`, and `SOUL_PATCH_NOTE:` directives that get appended via `append_patch_directive`. Notes document lessons learned without imposing a fixed response. If a `baseline.md` exists in the config directory's *parent*, its text is prepended to `SOUL.context` and any rules/verb mappings it contains are appended to the soul's lists. This is how the four tradesmen agents share a baseline persona while keeping per-agent specifics in their own `SOUL.md`. Markdown links in the `Context` section that resolve to local `.md` or `.txt` files are inlined verbatim. The agent's persona file can therefore pull in glossaries or shared playbooks without copy-paste. (connection-layer)= ## The Connection Layer `connection.py:MooConnection` opens an `asyncssh` channel with `TERM=xterm-256-basic`, which puts the django-moo shell into raw mode and enables IAC subnegotiation. The agent advertises itself as `moo-agent` via TTYPE/MTTS, accepts GMCP/MSSP/EOR/CHARSET, and refuses MSP. See {ref}`iac-layer` for the negotiation details. ### Surrogate-escape encoding The channel is configured with `errors="surrogateescape"` so 0xFF IAC bytes round-trip as `\udcff` Python str surrogates instead of raising `UnicodeDecodeError`. Outbound IAC reply bytes are encoded with the same mode and re-emitted by the channel verbatim. ### PREFIX/SUFFIX delimiter mode After a session is up, `MooSession.setup_delimiters(prefix, suffix)` switches the line buffer from "emit one line per `\n`" to "emit only the content between `>>MOO-START-{id}<<` and `>>MOO-END-{id}<<` markers." The delimiters are per-session (8-char hex of a fresh timestamp) so two agents on the same broker don't ever cross-talk. ### Why no suppress window during setup An earlier design suppressed all output between writing the setup commands and switching to delimiter mode, so the verbs' "Global output prefix set to…" confirmations would not pollute the agent's log. The cost was that any page or tell from another player landing in that window was silently extracted and dropped — Foreman's initial token dispatch routinely missed Joiner because the page arrived during Joiner's setup. The current design sends the setup commands in line mode (so confirmations and incoming pages both come through as visible server lines) and only flips to delimiter mode after settings have propagated. The setup confirmations are bounded (≤ 6 lines, once per session); a missed page costs minutes of stall recovery, so we choose the noise. ### Kombu broker latency Each `OUTPUTPREFIX` / `OUTPUTSUFFIX` / `a11y` verb publishes its session setting via Kombu, and the shell's `process_messages()` needs to drain the event into the server-side `_session_settings` dict before the *next* command's wrapping logic reads it. Kombu publish→consume has 200 ms+ of broker latency. The setup sequence sleeps 0.4 s between commands to give each setting time to land before the next command's response is wrapped. ### Preamble extraction in delimiter mode When delimiter mode finds a SUFFIX, it emits any complete preamble lines *before* the most recent PREFIX as individual lines. This captures `print()` output from a previous command that arrived after that command's suffix (Celery flush order). Trailing partial content between the last newline and the prefix marker is dropped — that's typically the server's interactive prompt (`>>>` in raw mode), which should never surface to the agent. ### Eager flush After the regular delimiter extraction, `_extract_delimited` eagerly flushes any complete lines that sit in the buffer ahead of the next pending PREFIX. These are `print()` confirmations from commands whose `tell()` output was empty — without the eager flush they would wait in the buffer until the next command, causing the agent to see no confirmation and retry the same command repeatedly. (iac-layer)= ## IAC (Telnet Subnegotiation) `iac.py` is the client-side mirror of the server's `moo/shell/iac.py`. It splits into three pieces: - **`IacParser`** — a byte-feed state machine that strips IAC sequences out of the data stream and emits parsed events (`("cmd", cmd, opt)`, `("sb", opt, payload)`, `("ga",)`, `("eor",)`). - **Encoders** (`encode_cmd`, `encode_sb`, `encode_ttype_is`, `encode_naws`, `encode_gmcp`, `encode_charset_request`) — produce the reply byte sequences. `encode_sb` doubles 0xFF in payloads per the telnet escaping rule. - **`AgentIacNegotiator`** — translates each parsed event into reply bytes and capability state changes. Side effects on negotiation completion (e.g. sending `Core.Hello` after GMCP enables) are emitted along with the immediate reply bytes. ### What we offer and accept - `_WE_OFFER = {OPT_TTYPE, OPT_NAWS, OPT_CHARSET}` — the agent enables these on its own side when the server asks (`DO X` → reply `WILL X`). - `_WE_ACCEPT_SERVER = {OPT_GMCP, OPT_MSSP, OPT_EOR_OPT, OPT_CHARSET}` — enabled on the server side when offered (`WILL X` → reply `DO X`). MSP is intentionally omitted — we can't play sounds. SGA is omitted on purpose: the server's `WONT SGA` is what enables `IAC GA` after each prompt, which is the prompt-boundary signal the agent reads. ### Loop suppression Servers that re-send `WILL`/`DO` when they see our `DO`/`WILL` (the django-moo server does this for accepted client options) would otherwise trigger an infinite ping-pong. The negotiator tracks already-enabled options on `capabilities` and replies only when state actually changes; already-refused options are tracked privately on `_refused_will` / `_refused_do` so repeat WILL/DO from the server are silently ignored without leaking sentinel keys into the public capabilities dict. ### TTYPE / MTTS handshake The TTYPE handshake is a three-stage loop: stage 1 returns the client name (`moo-agent`), stage 2 returns the terminal name (`XTERM-256COLOR`), stage 3 returns `MTTS `. The default MTTS bitfield advertises `ANSI | UTF-8 | 256-color | screen-reader` — the screen-reader bit is the truthful flag because the agent reads the output programmatically. After stage 3, any further `IAC SB TTYPE SEND` requests loop on the terminal name to signal we have nothing more to offer. ### GMCP handshake When GMCP enables, `_send_gmcp_handshake` emits `Core.Hello` (with the client name and version) and `Core.Supports.Set` advertising the packages the agent consumes (default: `Char 1`, `Room 1`, `Comm 1`, `MSSP 1`). The editor package is intentionally omitted — it requires programmatic save/cancel that's out of scope for the current MR. (llm-client)= ## The LLM Client `llm_client.py` is the provider-agnostic call wrapper. Three pieces live here: - `make_client(llm_config)` — picks the right SDK (`AsyncAnthropic`, `AsyncAnthropicBedrock`, or `AsyncOpenAI` against an LM Studio base URL). Brain holds a single client instance for the lifetime of the session so LM Studio can keep its KV cache warm across calls. - `parse_lm_studio_tool_calls(text, known_names)` — pure function. Four fallback strategies, tried in order, for extracting tool calls from plain-text output when LM Studio doesn't surface them through the OpenAI `tool_calls` field: 1. `{json}` XML blocks. 2. `` tags. 3. `TOOL: name arg=value` lines (via `parse_tool_line`). 4. Bare `name(k='v')` function calls validated against `known_names`. - `call_llm(...)` — the awaitable wrapper. For Anthropic/Bedrock, native tool use is requested when tools are non-empty. For LM Studio, structured `tool_calls` are tried first, then the text fallback. ### Special-token scrubbing Some local models (e.g. `gpt-oss` with Harmony templates) emit tokens like `<|channel>thought` or `<|im_start|>` into the assistant text. If these land in `memory_summary` or the rolling window, the next request to LM Studio fails with `Failed to parse input at pos 0: <|channel>...`. `_SPECIAL_TOKEN_RE` strips two forms: - `<|...|>` / `<|...>` — leading pipe, any content (e.g. `<|im_start|>`). - `` — trailing pipe only (e.g. ``). The scrub runs on every LLM response and on every line read from a prior session log (`session_log.py`). ### Observability `observability.py` wires the agent into [Pydantic Logfire](https://logfire.pydantic.dev). `setup_observability()` runs once at startup in `run_agent()`, before any LLM client is built — it calls `logfire.configure()` and then `instrument_anthropic()` / `instrument_openai()`, which patch the SDK classes globally. Because Instructor patches those same clients, every LLM call (and each Instructor re-ask retry) is traced with token usage, latency, and cost. `Brain._llm_cycle` opens a `logfire.span("llm_cycle")` around `_run_cycle_body`; the auto-instrumented LLM call nests under it through OpenTelemetry context, so one trace carries the goal, the LLM call, token/cost figures, and an `outcome` attribute (`dispatched`, `goal_only`, or `llm_failed`). Tracing is opt-in by environment variable: `configure()` uses `send_to_logfire="if-token-present"`, so traces ship only when `LOGFIRE_TOKEN` is set. Without it the calls are a local no-op. `console=False` keeps Logfire off stdout — the prompt_toolkit TUI would otherwise be corrupted. (tool-harness)= ## The Tool Harness `tools.py` defines `ToolParam`, `ToolSpec`, `LLMResponse`, and the `BUILDER_TOOLS` registry. A `ToolSpec` carries a name, description, typed parameter list, and a `translate(args) → list[str]` function. Translation keeps MOO command syntax out of the LLM's output path: the model says `dig(direction="north", room_name="The Library")` and the harness emits `@dig north to "The Library"`. ### Why `_norm_ref` exists LLMs routinely emit `target=22` or `obj=22` as tool args, which would translate to `@survey 22` / `@move 22 to ...`. The MOO parser then tries to look up an object literally named "22" in the current room and fails with `There is no '22' here.` `_norm_ref` rewrites bare positive integers to `#22` form at translation time, eliminating the entire class of error without burdening the agents with a guidance rule. Non-integer references (`#22`, `here`, `$player_start`, `"mahogany desk"`) are passed through unchanged. ### Schema flavors `to_anthropic_schema()` and `to_openai_schema()` produce the shapes each provider expects. When tools are active, the system prompt switches to `PATCH_INSTRUCTIONS_TOOLS_ACTIVE` so the LLM is told to call tools rather than emit free-form COMMAND/SCRIPT directives — the action vocabulary lives in the tool schemas. ### Three text-mode parsers `parse_tool_line` accepts three formats so that LM Studio fallback paths don't have to know which provider produced the text: - `TOOL: name(key="value" key2="value2")` — explicit prefix (the documented form). - `call:name{...}` / `tool_call:name{...}` / `tool_code:name(...)` — Gemma 4 native shape when LM Studio doesn't expose `tool_calls`. Gemma also wraps string values in `<|"|>...<|"|>` special tokens; `_strip_gemma_tokens` rewrites them to plain quotes before the key-value extractor runs. - `name(k="v", k2="v2")` — bare Python-style call. Only matched when a `known_names` set is supplied, so MOO commands that happen to contain parentheses don't get misidentified as tool calls. The argument regex (`_BARE_CALL_RE`) allows parentheses inside quoted strings (single or double), so values like `done(summary="Completed Gear Vault (#816)")` parse correctly. Without the quoted-string alternation the regex would stop at the first `)` inside the string and fail to match the whole call. ### Redundant-teleport suppression `_dispatch_tool_calls` and the bare-line fallback both inspect `teleport(destination=…)` calls and skip them when the destination already names the agent's current room (by `#N` id or name). The skip also pushes a synthetic line into the rolling window so the LLM sees authoritative feedback in the next cycle. Without that injection the silent skip produced no commands, no server output, and the `goal_only_count` re-cycle would just emit the same teleport call again on the next 1–3 follow-up cycles before stalling. (session-log)= ## Session Resume `session_log.py:read_prior_session` is the thin filesystem layer that lets a fresh run pick up where the previous one left off. Logs are named `YYYY-MM-DDTHH-MM-SS.log`, so lexicographic order equals chronological order. The function reads the most recent prior log, keeps only the entries whose kind is in `_RESUME_KINDS` (`action`, `server`, `goal`, `thought`, `server_error`), and returns the last 40 of those plus the most recent `[Goal] …​` line. A plan-exhausted marker (`[Plan] All planned rooms built.`) overrides the normal summary and replaces it with a hard instruction to call `done()` immediately. Otherwise, special-token scrubbing runs on every included line so a poisoned prior log can't re-poison the new session. `cli.py` then decides what to do with the result: - **Timer-based agents** discard both the prior summary and the prior goal — stale context causes them to skip mandatory first steps (e.g. mailmen skipping `@mail` listing). - **Page-triggered agents** discard the prior summary but keep the prior goal *only* to feed the auto-reconnect page mechanism. The goal is never set as `current_goal` — the agent always starts cold and waits for a fresh token page. (tui)= ## The TUI `tui.py` builds a prompt-toolkit full-screen application with two regions: a scrolling output pane on top and a single-line input field on the bottom. The status indicator on the input prompt (`ready`/`waiting`/`sleeping`/`thinking`) is updated by Brain via the `on_status_change` callback. The output pane uses a custom `_ScrollableOutputControl` that reports `cursor_position` at the last logical line when autoscrolling. In scroll mode (entered with Escape) the cursor tracks the viewport top, which — combined with directly setting `Window.vertical_scroll` in the key handlers — produces exact line-by-line and page scrolling. `window_height` is captured each render so key handlers can compute page jumps without calling any `render_info` API. Operator input from the TUI bypasses the rolling window's normal LLM-arming path: `enqueue_instruction` appends an `[Operator]:` line and immediately schedules an LLM cycle, because a direct instruction should always reach the LLM regardless of rule matches. ## Where to look next - For the directive grammar the LLM is taught: `brain/prompt.py` contains `PATCH_INSTRUCTIONS` (the LLM-facing reference document). - For the regex grammar that parses LLM responses: `brain/directives.py`. - For tool definitions: `tools.py:BUILDER_TOOLS`. - For the chain-relay test fixtures: `tests/test_brain_chain.py`. - For the LambdaCore-style server-side counterpart: see the django-moo docs at `docs/source/explanation/shell-internals.md` — the agent's PREFIX/SUFFIX delimiters and `a11y` settings are configured against that shell.