graff-agent: Hermes-class harness in a NEW repo, prerequisites in codegraff (#34)

## Thesis

`graff-agent` is a **new, separate repo** that turns the existing `graff` binary into a Hermes-class personal agent — self-improving, multi-platform, scheduled, runtime-pluggable. It does this by depending on `@codegraff/sdk` (the N-API package on `release/0.1.53`) from npm and adding the surface area Hermes has and we don't.

This issue lives in `codegraff` because the **prerequisite work is here**: a small set of SDK additions need to ship before the new repo is buildable. Once those land, `graff-agent` proceeds independently.

---

## Hermes inventory (full deep dive)

Sourced from a structured pass over the [hermes-agent wiki](https://deepwiki.com/nousresearch/hermes-agent) — every line below is a real subsystem with concrete file paths.

### 1. Agent core
- `AIAgent` class in `run_agent.py` — the central orchestrator. `AIAgent.run_conversation()` is the main loop.
- Provider/model resolution in `hermes_cli/runtime_provider.py`.
- Auxiliary LLM client (`agent/auxiliary_client.py`) for vision, summarization, web extraction — i.e. side tasks use a cheap model.

### 2. Skills system (agentskills.io compatible)
- 70+ bundled skills, optional skills hub.
- `skill_manage` tool with actions `create`, `patch`, `edit`, `write_file`, `delete`, `remove_file` (see `acp_adapter/tools.py`).
- Skills live under `~/.hermes/skills/<name>/` with `SKILL.md` + optional `references/`, `templates/`, `scripts/` — same layout we already use under `.forge/skills/`.

### 3. Closed learning loop (the differentiator)
- **Autonomous skill creation:** triggered after `skills.creation_nudge_interval` tool-calling iterations. Agent calls `skill_manage action=create` and writes to `~/.hermes/skills/`.
- **Skill self-improvement during use:** triggered when user corrects style/workflow or a non-trivial technique emerges. Priority order: patch loaded skill → update umbrella skill → add support file under umbrella → create new umbrella. Patches use `old_string`/`new_string`.
- **Periodic memory nudges:** `_MEMORY_REVIEW_PROMPT`, `_SKILL_REVIEW_PROMPT`, `_COMBINED_REVIEW_PROMPT` injected periodically inside `AIAgent` (run_agent.py). `_summarize_background_review_actions` produces user-facing summaries.
- **FTS5 session search:** SQLite at `~/.hermes/state.db` with FTS5 over conversation content. `session_search` tool. Summarization step uses Gemini Flash.
- **Memory files:** `MEMORY.md` and `USER.md` orchestrated by `agent/memory_manager.py`.
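The `old_string`/`new_string` patch style mentioned above can be sketched as a pure text operation. This is a minimal illustration, not Hermes' implementation: `applySkillPatch` and `SkillPatchResult` are hypothetical names, and the rule that `old_string` must match exactly once is an assumption made so that a patch never lands in the wrong place.

```typescript
// Hypothetical helper: applies one old_string/new_string patch to a skill file's text.
// Assumption: the patch is rejected unless old_string occurs exactly once.
type SkillPatchResult =
  | { ok: true; content: string }
  | { ok: false; reason: "not_found" | "ambiguous" };

function applySkillPatch(
  content: string,
  oldString: string,
  newString: string,
): SkillPatchResult {
  if (oldString.length === 0) return { ok: false, reason: "not_found" };
  const first = content.indexOf(oldString);
  if (first === -1) return { ok: false, reason: "not_found" };
  // A second hit (even an overlapping one) makes the target ambiguous.
  if (content.indexOf(oldString, first + 1) !== -1) {
    return { ok: false, reason: "ambiguous" };
  }
  return {
    ok: true,
    content:
      content.slice(0, first) + newString + content.slice(first + oldString.length),
  };
}

// Usage: insert a lint step into an existing skill body.
const skillBody = "## Steps\n1. Run tests\n2. Open PR\n";
const patchedSkill = applySkillPatch(
  skillBody,
  "2. Open PR",
  "2. Run linter\n3. Open PR",
);
```

Returning a tagged result instead of throwing lets the caller fall through to the next option in the priority order (for example, updating the umbrella skill) when the patch target is missing.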

### 4. User modeling — Honcho
- Plugin under `plugins/memory/honcho/` implementing the `MemoryProvider` ABC.
- Stores: session summary, user representation, AI peer card, persistent conclusions.
- Two-layer injection into system prompt every turn: base layer (cadence: `contextCadence`) + dialectic LLM layer (`dialecticCadence`).
- Tools: `honcho_profile`, `honcho_search`, `honcho_context`, `honcho_reasoning`, `honcho_conclude`.
- Cold-start vs warm-start prompt strategies based on prior session existence.

### 5. Messaging gateway
- `GatewayRunner` in `gateway/run.py` — long-running daemon, started/stopped via `hermes gateway start|stop`.
- `BasePlatformAdapter` in `gateway/platforms/base.py` defines the contract. Required: `connect`, `disconnect`, `send`, `send_typing`, `get_chat_info`. Optional: `send_document`, `send_voice`, `send_image_file`, etc.
- Adapters: `telegram.py`, `discord.py`, `slack.py`, plus WhatsApp, Signal, Email.
- **Session keys:** `agent:main:{platform}:{chat_type}:{chat_id}`, built via `gateway/session.build_session_key()`.
- **Per-user group sessions:** `group_sessions_per_user` (default `true`) — each sender in a group gets isolated state.
- **State persistence:** SQLite (`~/.hermes/state.db`) for metadata + JSONL transcripts in `~/.hermes/sessions/`.
- **Streaming:** progressive message edits when platform supports it (`SUPPORTS_MESSAGE_EDITING`), driven by `streaming.transport: edit`.
- **Voice:** auto-TTS gated by `voice.auto_tts` + per-chat `/voice on|off|tts`.

### 6. Cron scheduler
- `cron/jobs.py` + `cron/scheduler.py`.
- Jobs stored as JSON, support multiple schedule formats, can attach skills and scripts, deliver to any platform.

### 7. Subagent / parallelization
- `delegate_task` tool spawns isolated subagents that share the parent's iteration budget (no runaway loops).
- `execute_code` tool: agent writes Python that calls Hermes tools via RPC — collapses multi-step pipelines into a single inference call.
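The shared-budget idea behind `delegate_task` can be illustrated with a single counter that both parent and subagents draw from. This is a sketch of the concept only; `SharedIterationBudget` and `runLoop` are hypothetical names, and how Hermes actually accounts for iterations is not described here.

```typescript
// Hypothetical sketch: parent and subagents consume from one shared counter,
// so delegating work can never raise the total number of iterations.
class SharedIterationBudget {
  private used = 0;
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Consumes one iteration if any remain; returns false once exhausted.
  tryConsume(): boolean {
    if (this.used >= this.limit) return false;
    this.used += 1;
    return true;
  }

  get remaining(): number {
    return this.limit - this.used;
  }
}

// Runs up to `wanted` iterations, stopping early when the budget is empty.
function runLoop(budget: SharedIterationBudget, wanted: number): number {
  let done = 0;
  while (done < wanted && budget.tryConsume()) done += 1;
  return done;
}

const sharedBudget = new SharedIterationBudget(10);
const parentSteps = runLoop(sharedBudget, 4); // parent uses 4 iterations
const subagentSteps = runLoop(sharedBudget, 50); // subagent wants 50, only 6 remain
```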

### 8. Seven terminal backends
- All implement `BaseEnvironment` ABC under `tools/environments/`.
- Backends: `LocalEnvironment`, `DockerEnvironment`, `SSHEnvironment`, `SingularityEnvironment`, `ModalEnvironment`, plus Daytona and Vercel Sandbox.
- Selection driven by `terminal.backend` in `~/.hermes/config.yaml` (or `TERMINAL_ENV` env var). `_get_env_config()` + `_create_environment()` in `tools/terminal_tool.py`.

### 9. Trajectory recording + Atropos RL
- `environments/hermes_base_env.py` — `HermesAgentBaseEnv` extends Atropos `BaseEnv`.
- Trajectory recording substrate for evaluation and RL training.

### 10. Provider/model routing
- `hermes_cli/runtime_provider.py` resolves provider + model with cost/speed/quality preferences.
- Supports Nous Portal, OpenRouter, OpenAI, NVIDIA NIM, MiniMax, Kimi, z.ai, Hugging Face, custom endpoints.

### 11. Plugin system
- Four discovery sources: bundled (`<repo>/plugins/`), user (`~/.hermes/plugins/`), project (`./.hermes/plugins/` if `HERMES_ENABLE_PROJECT_PLUGINS=1`), and pip entry points (`hermes_agent.plugins`).
- `PluginManager` in `hermes_cli/plugins.py` runs `discover_and_load()`. Later sources override earlier on collision.
- Plugins register via `register(ctx)`:
  - Lifecycle hooks: `pre_tool_call`, `post_tool_call`, `pre_llm_call`, `post_llm_call`, `on_session_start`, `on_session_end`.
  - New tools: `ctx.register_tool(...)`.
  - CLI subcommands: `ctx.register_cli_command(...)`.
- Specialized: `MemoryProvider` (only one active), context engines (only one active).
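The "later sources override earlier" rule can be sketched as a single pass over the sources in discovery order. The type names and `resolvePlugins` are hypothetical, and the sketch assumes the discovery order is the order in which the four sources are listed above.

```typescript
// Hypothetical types; the four source kinds follow the list above.
type PluginSourceKind = "bundled" | "user" | "project" | "entry_point";

interface DiscoveredPlugin {
  name: string;
  source: PluginSourceKind;
  path: string;
}

// Assumed discovery order, earliest first.
const DISCOVERY_ORDER: PluginSourceKind[] = [
  "bundled",
  "user",
  "project",
  "entry_point",
];

// Later sources overwrite earlier ones when two plugins share a name.
// Project plugins are skipped unless explicitly enabled
// (HERMES_ENABLE_PROJECT_PLUGINS=1 in Hermes).
function resolvePlugins(
  found: DiscoveredPlugin[],
  projectPluginsEnabled: boolean,
): Map<string, DiscoveredPlugin> {
  const resolved = new Map<string, DiscoveredPlugin>();
  for (const source of DISCOVERY_ORDER) {
    if (source === "project" && !projectPluginsEnabled) continue;
    for (const candidate of found) {
      if (candidate.source === source) resolved.set(candidate.name, candidate);
    }
  }
  return resolved;
}
```

With a bundled and a user plugin both named `memory`, the user copy wins; a project plugin only appears when the flag is set.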

### 12. Other notable subsystems
- **Context compression:** `agent/context_compressor.py` summarizes turns near limits.
- **Prompt caching:** `agent/prompt_caching.py` applies Anthropic cache breakpoints.
- **ACP server:** `acp_adapter/` exposes Hermes as IDE-native agent over stdio/JSON-RPC for VS Code, Zed, JetBrains.
- **Voice mode + TTS** as first-class.

---

## What this repo (codegraff) has today

| Subsystem | Status | File |
|---|---|---|
| Agent core / main loop | yes | `crates/forge_app/src/agent_executor.rs`, `app.rs` |
| Provider routing + DTO transforms | yes (richer than Hermes) | `crates/forge_app/src/dto/{anthropic,google,openai}/` |
| MCP host | yes | `crates/forge_infra` (paginated tools, structured results) |
| TUI | yes | `crates/codegraff-tui/src/main.rs` (~7900 LOC) |
| Slash command palette | yes | recent v0.1.5 work |
| Subagent w/ model override | yes | v0.1.5 |
| Trajectory recording | yes | `forge_app/src/trajectory_recorder.rs`, `forge_repo/src/trajectory/` |
| `/trace`, `/resume` | yes | v0.1.5 |
| Skills (read-only, agentskills.io layout) | yes | `forge_domain/src/skill.rs`, `forge_services/src/tool_services/skill.rs`, `.forge/skills/` |
| Context compression | yes | `forge_app/src/compact.rs` (931 LOC) |
| Prompt caching | yes | transforms in OpenAI/Anthropic DTO |
| N-API SDK | yes (on `release/0.1.53`) | `sdk/typescript/` |
| Headless mode | yes | `forge_main/src/main.rs` (`-p`, stdin, `--conversation-id`) |
| Conversation dump/load | yes | `forge_main/src/cli.rs` (`conversation dump/show`) |

---

## Gap analysis — Hermes vs codegraff today

| Hermes subsystem | codegraff status | Where it gets built |
|---|---|---|
| Agent core | ✓ | — |
| Skills (read-only) | ✓ | — |
| **Skill mutation API** | ✗ | **codegraff** (Part A) |
| **Skill self-improvement loop** | ✗ | **graff-agent** (uses mutation API) |
| **Autonomous skill creation nudges** | ✗ | **graff-agent** (review prompt injection) |
| **`MEMORY.md` / `USER.md` files** | ✗ | **graff-agent** (memory manager sidecar) |
| **FTS5 session search** | ✗ | **codegraff** (Part A: virtual table + SDK) |
| **Periodic review prompts (memory + skill)** | ✗ | **graff-agent** (cron-driven prompt injection) |
| **Honcho / dialectic user modeling** | ✗ | **graff-agent** (memory provider plugin) |
| **Pluggable memory provider trait** | ✗ | **codegraff** (Part A) |
| **Plugin system (lifecycle hooks, register_tool, register_cli)** | ✗ | **codegraff** (Part A — minimal first) |
| **Messaging gateway daemon** | ✗ | **graff-agent** (per-platform adapters) |
| **Per-platform adapters (Telegram, Discord, Slack, WhatsApp, Signal, Email)** | ✗ | **graff-agent** |
| **Session-key routing (`agent:main:{platform}:{chat_type}:{chat_id}`)** | ✗ | **graff-agent** |
| **Group-chat per-user sessions** | ✗ | **graff-agent** |
| **Streaming message edits per platform** | ✗ | **graff-agent** |
| **Voice memo transcription + auto-TTS** | ✗ | **graff-agent** |
| **Cron scheduler** | ✗ | **graff-agent** |
| **Pluggable execution environments (BaseEnvironment trait)** | ✗ | **codegraff** (trait) + **graff-agent** (Docker/SSH/Modal/Daytona/Vercel adapters) |
| **`delegate_task` parallelism** | partial (subagents exist) | **codegraff** polish (already mostly here) |
| **`execute_code` tool (Python RPC into agent tools)** | ✗ | **graff-agent** (Node/TS sidecar) |
| **ACP server (IDE integration)** | ✗ | **codegraff** (separate roadmap) |
| **Auxiliary LLM client (cheap model for side tasks)** | partial | **codegraff** (formalize) |
| **Trajectory subscription stream** | partial (write-only via recorder) | **codegraff** (Part A — broadcast → SDK) |
| **Pending-nudges queue** | ✗ | **codegraff** (Part A — table + SDK) |
| **Conversation export/import (JSONL transcripts)** | partial (JSON dump exists) | **codegraff** (already close) |

---

## Two repos, clear separation

```
┌──────────────────────────────────────┐    ┌──────────────────────────────────────┐
│  codegraff  (this repo)              │    │  graff-agent  (NEW separate repo)    │
│                                      │    │                                      │
│  Rust workspace:                     │    │  TypeScript / Node monorepo:         │
│   - graff binary (TUI)               │    │   - depends on @codegraff/sdk        │
│   - @codegraff/sdk (N-API)           │──▶ │     from npm — no Rust toolchain     │
│   - SQLite + diesel + FTS5           │ npm│     required to develop              │
│   - MCP host                         │    │                                      │
│   - skill mutation API               │    │  Packages:                           │
│   - trajectory broadcast             │    │   - graff-gateway (per-platform)     │
│   - pending_nudges queue             │    │   - graff-cron                       │
│   - memory_provider trait            │    │   - graff-memd  (learning loop)      │
│   - plugin system (minimal)          │    │   - graff-honcho (memory provider)   │
│   - exec_environment trait           │    │   - graff-runtimes (Docker/SSH/...)  │
│                                      │    │   - graff-shared                     │
│  Stays focused on agent core +       │    │                                      │
│  the substrate everything plugs into.│    │  Ships its own CLI (`graff-agent`)   │
└──────────────────────────────────────┘    │  that orchestrates the daemons.      │
                                            └──────────────────────────────────────┘
```

**Why split:**
- `graff-agent` developers don't need a Rust toolchain — `npm install` pulls prebuilt N-API binaries.
- `codegraff` keeps shipping the core and SDK on its own cadence; `graff-agent` ships gateway/cron/memd independently.
- A bug in the Telegram adapter never blocks a release of the core agent.
- Access to the TS ecosystem (Bun, npm, platform SDKs that mostly target JS) without compromising the Rust monorepo.
- Issues, releases, roadmaps stay scoped.

---

## Part A — work that lands in `codegraff` (this repo)

These are the prerequisites that make `graff-agent` buildable. Each becomes a follow-up issue.

### A.1 — SDK additions (`sdk/typescript/src/lib.rs` + `wire.rs`)
- [ ] **Trajectory subscription stream** — `graff.trajectory.subscribe(conversationId): AsyncIterable<TrajectoryEvent>`. Tokio broadcast → N-API ThreadsafeFunction.
- [ ] **Skill mutation API** — `graff.skills.create | update | patch | writeFile | delete | removeFile`. Mirrors Hermes' `skill_manage` actions one-to-one (`update` corresponds to Hermes' `edit`).
- [ ] **Compact pipeline exposed** — `graff.conversations.compact(cid): Promise<Digest>` reusing `forge_app/src/compact.rs`.
- [ ] **Pending-nudges queue** — `graff.nudges.enqueue(cid, message)`.
- [ ] **User-profile API** — `graff.user.facts.{list, upsert, get, delete}`.
- [ ] **Recall API** — `graff.search.recall(query, opts): Promise<RecallHit[]>` over the FTS5 table.
- [ ] **Context-engine + memory-provider hooks** — register-time interface so a memory provider (e.g. graff-honcho) can inject system-prompt context every turn (`onSystemPromptAssemble`).
- [ ] **Lifecycle hooks** — `onPreToolCall`, `onPostToolCall`, `onPreLLMCall`, `onPostLLMCall`, `onSessionStart`, `onSessionEnd`. Sidecars subscribe to these; together they form the first, minimal version of the plugin system.
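One possible shape for the lifecycle-hook surface, as seen from a sidecar, is a small registry keyed by hook name. The hook names come from the list above; `HookRegistry`, `HookPayload`, and the isolate-failures behaviour are assumptions for illustration, not the planned SDK API.

```typescript
// Hook names come from the list above; everything else here is hypothetical.
type HookName =
  | "onPreToolCall"
  | "onPostToolCall"
  | "onPreLLMCall"
  | "onPostLLMCall"
  | "onSessionStart"
  | "onSessionEnd";

interface HookPayload {
  conversationId: string;
  toolName?: string; // assumed to be set only for tool hooks
}

type HookHandler = (payload: HookPayload) => void;

class HookRegistry {
  private handlers = new Map<HookName, HookHandler[]>();

  // Registers a handler and returns a function that removes it again.
  on(hook: HookName, handler: HookHandler): () => void {
    const list = this.handlers.get(hook) ?? [];
    list.push(handler);
    this.handlers.set(hook, list);
    return () => {
      const current = this.handlers.get(hook) ?? [];
      this.handlers.set(hook, current.filter((h) => h !== handler));
    };
  }

  // Calls handlers in registration order. A throwing handler is skipped so
  // one faulty sidecar cannot block the others. Returns how many succeeded.
  emit(hook: HookName, payload: HookPayload): number {
    let succeeded = 0;
    for (const handler of this.handlers.get(hook) ?? []) {
      try {
        handler(payload);
        succeeded += 1;
      } catch {
        // A real implementation would log the failure here.
      }
    }
    return succeeded;
  }
}

const hookRegistry = new HookRegistry();
const seenTools: string[] = [];
const unsubscribeTools = hookRegistry.on("onPreToolCall", (p) => {
  seenTools.push(p.toolName ?? "unknown");
});
```

Returning an unsubscribe function from `on` keeps sidecar shutdown simple: each daemon holds the closures it received and calls them on exit.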

### A.2 — Backing changes in `forge_app` / `forge_repo`
- [ ] `pending_nudges` table + poll point in the conversation loop.
- [ ] `user_facts` table (backs the `graff.user.facts` API).
- [ ] FTS5 virtual table over `trajectory_events` (or a derived `messages` table) + diesel migration.
- [ ] `MemoryProvider` trait in `forge_domain` + injection point in system-prompt assembly (mirrors Hermes' Layer-3 Honcho block).
- [ ] `ExecutionEnvironment` trait + `Local` impl. (Docker/SSH/Modal/etc. land in `graff-agent`.)
- [ ] Skill repository write path (`SkillRepository::write`, `delete`, `patch`). YAML frontmatter parsing for progressive disclosure.
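For the FTS5-backed recall path, user text should not be passed to `MATCH` verbatim, because FTS5 treats characters such as `"`, `*`, and `:` as query syntax. A minimal sketch of a safe query builder follows. `toFts5Match` and `RECALL_SQL` are hypothetical names, the table name `messages_fts` is taken from sub-issue #36, and column index `0` holding the message text is an assumption.

```typescript
// Turns free-form text into an FTS5 MATCH expression made of quoted terms.
// FTS5 string literals use double quotes, with embedded quotes doubled;
// space-separated phrases are combined with an implicit AND.
function toFts5Match(input: string): string {
  return input
    .split(/\s+/)
    .filter((term) => term.length > 0)
    .map((term) => `"${term.replace(/"/g, '""')}"`)
    .join(" ");
}

// Parameterized recall query. Callers should skip the query entirely when
// toFts5Match() returns "", since an empty MATCH expression is an error.
const RECALL_SQL =
  "SELECT rowid, snippet(messages_fts, 0, '[', ']', '...', 12) AS excerpt " +
  "FROM messages_fts WHERE messages_fts MATCH ? ORDER BY rank LIMIT ?";

const matchExpr = toFts5Match('deploy "prod"  now');
// matchExpr is: "deploy" """prod""" "now"
```

Quoting every term trades away prefix and boolean operators for predictability; a richer query language can be layered on later if recall quality demands it.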

### A.3 — Release
- [ ] Wire `npm publish` step in `.github/workflows/sdk-typescript.yml` (CI already builds the matrix; assemble job stops short of publish).
- [ ] First published release: `@codegraff/sdk@0.2.0`.

### A.4 — `graff` CLI surface (small)
- [ ] `graff conversation export <cid>` → JSONL transcript (Hermes-style `~/.hermes/sessions/`).
- [ ] `graff doctor` for environment + provider sanity (Hermes parity).

---

## Part B — `graff-agent` repo (new, future)

Layout once Part A ships:

```
graff-agent/
├── package.json                # pnpm workspaces root
├── packages/
│   ├── graff-shared/           # session keys, SQLite schema, types
│   ├── graff-memd/             # closed learning loop daemon
│   ├── graff-gateway/          # platform adapters
│   │   ├── src/platforms/
│   │   │   ├── base.ts         # PlatformAdapter abstract
│   │   │   ├── telegram.ts
│   │   │   ├── discord.ts
│   │   │   ├── slack.ts
│   │   │   ├── whatsapp.ts
│   │   │   ├── signal.ts
│   │   │   └── email.ts
│   │   └── src/runner.ts       # GatewayRunner equivalent
│   ├── graff-cron/             # jobs.json + scheduler
│   ├── graff-honcho/           # memory provider plugin
│   ├── graff-runtimes/         # Docker, SSH, Modal, Daytona, Vercel
│   └── graff-agent/            # `graff-agent` CLI: start/stop/setup
├── docs/
└── examples/
```

### B.1 — `graff-memd` (closed learning loop)
Mirrors Hermes' nudge + review system, file-by-file analog:

| Hermes file | graff-memd equivalent |
|---|---|
| `_MEMORY_REVIEW_PROMPT` in `run_agent.py` | `packages/graff-memd/src/prompts/memory-review.ts` |
| `_SKILL_REVIEW_PROMPT` | `prompts/skill-review.ts` |
| `_COMBINED_REVIEW_PROMPT` | `prompts/combined-review.ts` |
| `_summarize_background_review_actions` | `digest/summarize.ts` |
| `agent/context_compressor.py` | reuse `graff.conversations.compact()` from SDK |
| `agent/memory_manager.py` | `memory/manager.ts` — owns `MEMORY.md`, `USER.md` files in `~/.graff/` |

Loop: subscribe to trajectory events → on `TaskComplete`, run review prompt via `runAgent({ model: cheapModel })` → if agent calls skill mutation tools, they hit the SDK API → on cron tick, inject `pending_nudges` for review prompts.
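The loop above can be sketched as two small handlers with injected dependencies, which keeps the daemon testable without a live SDK. Only the `TaskComplete` event name comes from the description; `MemdEvent`, `MemdDeps`, and both function names are hypothetical, and the real event payloads are not specified here.

```typescript
// Hypothetical event shape; only the "TaskComplete" kind is named in the text.
interface MemdEvent {
  kind: string;
  conversationId: string;
}

// Injected side effects. In a real daemon these would wrap the SDK:
// runReview    -> runAgent({ model: cheapModel }) with a review prompt
// enqueueNudge -> graff.nudges.enqueue(cid, message)
interface MemdDeps {
  runReview: (conversationId: string) => void;
  enqueueNudge: (conversationId: string, message: string) => void;
}

// Called for every trajectory event; returns true when a review was started.
function onTrajectoryEvent(ev: MemdEvent, deps: MemdDeps): boolean {
  if (ev.kind !== "TaskComplete") return false;
  deps.runReview(ev.conversationId);
  return true;
}

// Called on each cron tick; queues one review nudge per active conversation
// and returns how many were queued.
function onCronTick(
  activeConversationIds: string[],
  reviewPrompt: string,
  deps: MemdDeps,
): number {
  for (const cid of activeConversationIds) {
    deps.enqueueNudge(cid, reviewPrompt);
  }
  return activeConversationIds.length;
}
```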

### B.2 — `graff-gateway`
Direct port of Hermes' `gateway/`:
- `PlatformAdapter` abstract (matches Hermes' `BasePlatformAdapter`).
- Session key format identical: `agent:main:{platform}:{chat_type}:{chat_id}`.
- Per-platform: webhook (preferred) + long-poll fallback. Streaming message edits where supported. Voice-in via Whisper.
- SQLite at `~/.graff/gateway.db` for `(platform, chat_id) → conversation_id` mapping; main conversation state stays in `@codegraff/sdk`'s SQLite.
- Order: Telegram → Discord → Slack → WhatsApp → Signal → Email.
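The session key format above is simple enough to pin down with a build/parse pair in `graff-shared`. This is a minimal sketch: the function and type names are hypothetical, and it assumes that `platform` and `chat_type` never contain `:` while `chat_id` may.

```typescript
interface SessionKeyParts {
  platform: string;
  chatType: string;
  chatId: string;
}

// Builds a key in the agent:main:{platform}:{chat_type}:{chat_id} format.
function buildSessionKey(parts: SessionKeyParts): string {
  return `agent:main:${parts.platform}:${parts.chatType}:${parts.chatId}`;
}

// Parses a key back into its parts, or returns null if it is malformed.
// Assumption: only chat_id may contain ":" characters.
function parseSessionKey(key: string): SessionKeyParts | null {
  const m = /^agent:main:([^:]+):([^:]+):(.+)$/.exec(key);
  if (m === null) return null;
  return { platform: m[1], chatType: m[2], chatId: m[3] };
}

const telegramKey = buildSessionKey({
  platform: "telegram",
  chatType: "group",
  chatId: "-100123",
});
// telegramKey is "agent:main:telegram:group:-100123"
```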

### B.3 — `graff-cron`
- `~/.graff/cron/jobs.json` (Hermes uses JSON not TOML — match for portability).
- Multiple schedule formats (cron, every-N, at-time).
- Jobs can attach `skill: <name>` + `script: <path>` + `delivery: {platform, chat_id}`.
- Output captured to `~/.graff/cron/runs/<job_id>/<run_id>.jsonl`.
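A possible shape for one entry in `jobs.json`, together with a due-check for the two simpler schedule formats, is sketched below. All type and function names are hypothetical, as is the exact field layout of a job. Cron expressions are deliberately left to a dedicated parser and are not handled.

```typescript
// Hypothetical schedule variants for the three formats listed above.
type CronSchedule =
  | { kind: "cron"; expr: string }
  | { kind: "every"; minutes: number }
  | { kind: "at"; iso: string };

// Hypothetical job record as it might appear in ~/.graff/cron/jobs.json.
interface CronJob {
  id: string;
  schedule: CronSchedule;
  skill?: string;
  script?: string;
  delivery?: { platform: string; chat_id: string };
}

// Path of the JSONL file that captures one run's output.
function runLogPath(jobId: string, runId: string): string {
  return `~/.graff/cron/runs/${jobId}/${runId}.jsonl`;
}

// Decides whether a job should fire now. lastRunMs is null if it never ran.
function isDue(
  schedule: CronSchedule,
  lastRunMs: number | null,
  nowMs: number,
): boolean {
  switch (schedule.kind) {
    case "every":
      return lastRunMs === null || nowMs - lastRunMs >= schedule.minutes * 60000;
    case "at":
      // One-shot: fires once, the first time the target time has passed.
      return lastRunMs === null && nowMs >= Date.parse(schedule.iso);
    case "cron":
      throw new Error("cron expressions need a parser; not covered by this sketch");
  }
}

const digestJob: CronJob = {
  id: "daily-digest",
  schedule: { kind: "every", minutes: 30 },
  skill: "summarize-inbox",
  delivery: { platform: "telegram", chat_id: "-100123" },
};
```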

### B.4 — `graff-honcho`
- Implements the codegraff `MemoryProvider` trait (added in Part A) via the SDK plugin hooks.
- Stores: session summary, user representation, peer card, persistent conclusions (mirrors Hermes).
- Tools registered via plugin: `honcho_profile`, `honcho_search`, `honcho_context`, `honcho_reasoning`, `honcho_conclude`.
- Two-layer system-prompt injection (base + dialectic).

### B.5 — `graff-runtimes`
- TS interface `ExecutionEnvironment` (matches the codegraff Rust trait, talks to it via SDK).
- Adapters: `Docker`, `SSH`, `Singularity`, `Modal`, `Daytona`, `VercelSandbox`. (Local stays in codegraff.)
- Each adapter is its own subpackage so users only install what they need.
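Since each adapter ships as its own subpackage, a small name-to-factory registry is one way to let users opt in to only the runtimes they install. The interface and registry below are a sketch under assumptions: `ExecEnvironment`, `ExecResult`, and `EnvironmentRegistry` are hypothetical names, and the real Rust trait may expose a different method set.

```typescript
// Hypothetical TS mirror of the planned ExecutionEnvironment trait.
interface ExecResult {
  exitCode: number;
  stdout: string;
  stderr: string;
}

interface ExecEnvironment {
  readonly name: string;
  exec(command: string): Promise<ExecResult>;
  dispose(): Promise<void>;
}

type EnvironmentFactory = () => ExecEnvironment;

// Each adapter subpackage would call register() once when it is loaded.
class EnvironmentRegistry {
  private factories = new Map<string, EnvironmentFactory>();

  register(name: string, factory: EnvironmentFactory): void {
    if (this.factories.has(name)) {
      throw new Error(`environment "${name}" is already registered`);
    }
    this.factories.set(name, factory);
  }

  // Creates a fresh environment, or fails with the list of known names.
  create(name: string): ExecEnvironment {
    const factory = this.factories.get(name);
    if (factory === undefined) {
      throw new Error(
        `unknown environment "${name}"; installed: ${this.names().join(", ")}`,
      );
    }
    return factory();
  }

  names(): string[] {
    return Array.from(this.factories.keys()).sort();
  }
}
```

A failing `create` that lists the installed names gives a clear hint when a user selects a backend whose subpackage is missing.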

### B.6 — `graff-agent` CLI
Mirrors Hermes' top-level CLI:
- `graff-agent` (start interactive) — actually shells out to `graff` for the TUI.
- `graff-agent gateway start|stop|setup`
- `graff-agent cron list|add|run|delete`
- `graff-agent skills hub` (community hub fetch)
- `graff-agent doctor`
- `graff-agent setup` (full wizard)

---

## Roadmap

**Phase 0 — codegraff Part A** (this issue, broken into ~10 sub-issues)
Land all SDK + backing changes; cut `@codegraff/sdk@0.2.0` to npm.

**Phase 1 — `graff-agent` repo bootstrap**
Empty repo, pnpm workspaces, CI, `@codegraff/sdk@0.2.0` integrated, smoke test.

**Phase 2 — `graff-memd` v0**
- FTS5 recall via SDK
- Session-end digester (compact pipeline)
- Skill self-improvement loop (review-gated)
- Memory + skill review nudges
- `MEMORY.md` / `USER.md` files

**Phase 3 — `graff-gateway` Telegram only**
- Adapter contract + runner
- Webhook + streaming edits + voice-in
- Cross-platform session-key map

**Phase 4 — `graff-cron`**

**Phase 5 — More platforms** (Discord, Slack, WhatsApp, Signal, Email)

**Phase 6 — `graff-honcho` memory provider**

**Phase 7 — `graff-runtimes`** (Docker first, then SSH, then sandbox providers)

**Phase 8 — `execute_code` tool, ACP server** (parallel tracks once core lands)

---

## Open questions (carried)

- `graff-memd` language: TS via SDK (consistency, faster iteration) vs Rust (zero N-API overhead, direct DB). **Default: TS.** Rewrite later if profiling demands.
- Skill format: stay 100% on agentskills.io spec, or extend with `version` + `confidence` for the self-improvement loop? **Decide when skill mutation API lands.**
- Memory provider activation: only-one-active (Hermes' rule) vs stacked? Mirror Hermes for now.
- Should the gateway's per-platform SQLite live in `~/.graff/` or share `@codegraff/sdk`'s DB? Probably separate, joined by `conversation_id`.

## Out of scope here

- Copying Hermes code — `graff-agent` is an independent re-implementation atop `graff`.
- RL trajectory generation / Atropos environments — recording substrate exists; training is its own workstream.
- Building any `graff-agent` package in this issue — that work happens in the new repo. **This issue tracks only Part A.**

---

## Sub-issues (filed)

Tracked here for execution order. All target `release/0.1.53`.

### P0 — substrate (do first; everything else depends on these)
- [ ] #35 — Skill mutation API: write path + frontmatter parser + SDK `skills.{create,update,patch,writeFile,delete,removeFile}`
- [ ] #36 — FTS5 session search: `messages_fts` virtual table + SDK `search.recall`
- [ ] #37 — Pending-nudges queue: `pending_nudges` table + conversation-loop poll point + SDK `nudges.enqueue`
- [ ] #38 — User profile facts: `user_facts` table + SDK `user.facts.{list,upsert,get,delete}`

### P1 — live data flow (graff-memd needs these to react in real time)
- [ ] #39 — Trajectory subscription stream: tokio broadcast + SDK `trajectory.subscribe`
- [ ] #40 — Compact pipeline exposed: SDK `conversations.compact`

### P2 — extensibility (gateway / cron / memd / honcho / runtimes plug in here)
- [ ] #41 — MemoryProvider trait + system-prompt injection hook + SDK `plugins.registerMemoryProvider`
- [ ] #42 — Plugin lifecycle hooks (pre/post tool, pre/post LLM, session start/end)
- [ ] #43 — ExecutionEnvironment trait + Local impl + SDK `exec.registerEnvironment`

### P3 — release gate
- [ ] #44 — npm publish wiring for `@codegraff/sdk@0.2.0`

### P4 — CLI polish
- [ ] #45 — `graff conversation export <cid>` (JSONL) + `graff doctor`

Once all of P0–P3 land and `@codegraff/sdk@0.2.0` is on npm, the new `graff-agent` repo is unblocked and Phase 1 begins there.

