Problem
phantom_loop ticks bypass the entire intelligence stack - no evolved config, no memory recall, no post-session evolution, no cross-model judges. Each tick is a naive LLM call with no benefit from anything Phantom has learned.
The loop is a context-window management strategy, not a reason to skip the intelligence layer. Interactive sessions get the full stack; loops - the primary mechanism for long-running autonomous work - get none of it.
Proposed fix (3 phases)
Phase 1: Inject evolved config + memory into tick prompts
Each tick gets persona, domain knowledge, error recovery strategies, and recalled memories. Zero extra LLM calls (one local embedding query per tick for memory recall).
- Extend `RunnerDeps` with optional `memoryContextBuilder`, `evolvedConfig`, and `roleTemplate`
- Wire them from `src/index.ts` (same instances already in the router scope)
- Extend `buildTickPrompt()` to inject evolved config sections before the goal and memory context before the state file
- Call `contextBuilder.build(loop.goal)` per tick for memory recall
Modified: `src/loop/runner.ts`, `src/loop/prompt.ts`, `src/index.ts`
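The Phase 1 wiring can be sketched as follows. This is a sketch, not the real implementation: the shapes of `EvolvedConfig` and `MemoryContextBuilder`, the section headers, and the `TickPromptParts` helper type are assumptions; only the names `buildTickPrompt()`, `RunnerDeps` fields, and the ordering rule (config before goal, memory before state file) come from the plan above.

```typescript
// Sketch only: EvolvedConfig / MemoryContextBuilder shapes are assumptions.
interface EvolvedConfig {
  persona: string;
  domainKnowledge: string[];
  errorRecovery: string[];
}

interface MemoryContextBuilder {
  // One local embedding query per tick - no LLM call involved.
  build(query: string): Promise<string>;
}

interface TickPromptParts {
  goal: string;
  stateFile: string;
}

// Assemble the tick prompt: evolved config sections before the goal,
// recalled memory context before the state file.
async function buildTickPrompt(
  parts: TickPromptParts,
  evolvedConfig?: EvolvedConfig,
  memory?: MemoryContextBuilder,
): Promise<string> {
  const sections: string[] = [];
  if (evolvedConfig) {
    sections.push(`## Persona\n${evolvedConfig.persona}`);
    if (evolvedConfig.domainKnowledge.length > 0) {
      sections.push(`## Domain knowledge\n- ${evolvedConfig.domainKnowledge.join("\n- ")}`);
    }
    if (evolvedConfig.errorRecovery.length > 0) {
      sections.push(`## Error recovery\n- ${evolvedConfig.errorRecovery.join("\n- ")}`);
    }
  }
  sections.push(`## Goal\n${parts.goal}`);
  if (memory) {
    sections.push(`## Recalled memories\n${await memory.build(parts.goal)}`);
  }
  sections.push(`## State\n${parts.stateFile}`);
  return sections.join("\n\n");
}
```

Because both config and memory parameters are optional, a runner wired without them degrades to today's naive tick prompt rather than breaking.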
Phase 2: Post-loop evolution and memory consolidation
After a loop finishes, synthesize a SessionSummary from accumulated tick transcripts and feed it through afterSession() and consolidateSessionWithLLM(). Phantom learns from autonomous work, not just interactive conversations.
- Accumulate tick prompt/response pairs in-memory during the run
- On finalize, build a `SessionSummary` (loop status maps to outcome: `done` -> success, `stopped` -> abandoned, `budget_exceeded`/`failed` -> failure)
- Call `evolution.afterSession(summary)` - the pipeline is completely channel-agnostic, no changes needed
- Call `consolidateSessionWithLLM()` to store the run as a vector-backed episode
Modified: `src/loop/runner.ts`, `src/index.ts`
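The status-to-outcome mapping and summary construction can be sketched as below. The `SessionSummary` fields and the `channel` discriminator are assumptions modeled on the interactive-session path; only the status/outcome pairs come from the plan above.

```typescript
// Sketch only: SessionSummary shape is an assumption.
type LoopStatus = "done" | "stopped" | "budget_exceeded" | "failed";
type Outcome = "success" | "abandoned" | "failure";

interface TickTranscript {
  prompt: string;
  response: string;
}

interface SessionSummary {
  channel: "loop";
  goal: string;
  outcome: Outcome;
  transcript: TickTranscript[];
}

// Map the loop's terminal status onto the session outcome used by evolution.
// The switch is exhaustive over LoopStatus, so TypeScript guarantees a return.
function outcomeFor(status: LoopStatus): Outcome {
  switch (status) {
    case "done":
      return "success";
    case "stopped":
      return "abandoned";
    case "budget_exceeded":
    case "failed":
      return "failure";
  }
}

// Build the summary from tick transcripts accumulated in-memory during the run;
// the result would then flow through afterSession() and consolidateSessionWithLLM().
function buildSessionSummary(
  goal: string,
  status: LoopStatus,
  ticks: TickTranscript[],
): SessionSummary {
  return { channel: "loop", goal, outcome: outcomeFor(status), transcript: ticks };
}
```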
Phase 3: Mid-loop critique checkpoints
For long loops (10+ ticks), run a Sonnet 4.6 judge every N ticks to detect drift before the budget burns out. This is the only phase that adds a new LLM call.
- New `src/loop/critique.ts` module: reads state file + tick history, asks quality judge if the loop is making progress or stuck
- Configurable `checkpoint_interval` (default 5 ticks, disabled for short loops)
- Critique injected into next tick prompt as "Reviewer feedback" section
Modified: `src/loop/runner.ts`, `src/loop/prompt.ts`
New: `src/loop/critique.ts`
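The checkpoint scheduling and feedback formatting can be sketched as follows. The `QualityJudge` signature, the `CritiqueOptions` field names, and the threshold constants are assumptions; only the 5-tick default interval, the short-loop cutoff (~10 ticks), and the "Reviewer feedback" section header come from the plan above.

```typescript
// Sketch only: judge signature and option names are assumptions.
interface CritiqueOptions {
  checkpointInterval: number; // default 5 ticks
  minLoopTicks: number;       // checkpoints disabled below this (short loops)
}

interface CritiqueResult {
  onTrack: boolean;
  feedback: string;
}

// The cross-model judge (e.g. a Sonnet-class reviewer) behind a thin interface.
type QualityJudge = (stateFile: string, history: string[]) => Promise<CritiqueResult>;

// Decide whether this tick is a checkpoint tick. This is the only place in
// the three phases that can trigger a new LLM call.
function isCheckpoint(tick: number, plannedTicks: number, opts: CritiqueOptions): boolean {
  if (plannedTicks < opts.minLoopTicks) return false; // short loops skip critique
  return tick > 0 && tick % opts.checkpointInterval === 0;
}

// Run the judge and format its verdict for injection into the next tick prompt.
async function critique(
  judge: QualityJudge,
  stateFile: string,
  history: string[],
): Promise<string> {
  const result = await judge(stateFile, history);
  const verdict = result.onTrack ? "on track" : "drifting";
  return `## Reviewer feedback (${verdict})\n${result.feedback}`;
}
```

Keeping the interval check separate from the judge call lets the runner decide cheaply every tick whether to pay for a critique.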
Key notes
- `buildTickPrompt()` in `src/loop/prompt.ts`