Skip to content

fix(trajectory): record subagent runs under their own agent_id (#33)#111

Merged
justrach merged 1 commit into
mainfrom
fix/subagent-trajectory-recording
May 20, 2026
Merged

fix(trajectory): record subagent runs under their own agent_id (#33)#111
justrach merged 1 commit into
mainfrom
fix/subagent-trajectory-recording

Conversation

@justrach

Copy link
Copy Markdown
Owner

Summary

Closes #33/trace and any other consumer of trajectory_events
previously saw only the root agent's tool calls. Every Task dispatch
silently bypassed the recorder because the child ForgeApp constructed
inside AgentExecutor::execute had no TrajectoryRecorder attached.

This wires the raw Arc<dyn TrajectoryRepo> from ForgeAPI::chat
through to AgentExecutor via constructor injection, then builds a
child TrajectoryRecorder scoped to the dispatched agent
((conversation_id, child_agent_id, Some(parent_agent_id))). Recursive
subagent dispatches inherit the repo and build their own recorders, so
the /trace tree walks naturally for arbitrary depth.

Acceptance (from #33)

SELECT DISTINCT agent_id
FROM trajectory_events
WHERE conversation_id = ?;

now returns the parent plus every dispatched child, and child rows
carry parent_agent_id linking back to the dispatcher.

Why constructor injection (not a trait bound)

The earlier Option-A sketch added a TrajectoryRepo bound to
AgentExecutor<S>, which propagated up into ForgeServices and the
api-layer generics. Option B (this PR) keeps the trait off the type
parameters and threads a concrete Arc<dyn TrajectoryRepo> through
the same builder chain that already plumbs the recorder. Smaller blast
radius, identical behaviour.

Plumbing path

  • AgentExecutor::with_trajectory_repo(...) — new builder
  • AgentExecutor::execute(...) — new parent_agent_id: AgentId param;
    builds a child recorder when the repo is set
  • ToolRegistry::with_trajectory_repo(...) — forwards to the
    embedded AgentExecutor
  • ToolRegistry Task-tool + agent-delegation callsites — capture
    agent.id and pass through as parent_agent_id
  • ForgeApp::with_trajectory_repo(...) — mirrors
    with_trajectory_recorder(...); both set together in
    ForgeAPI::chat
  • ForgeAPI::chat — calls both .with_trajectory_recorder(...) and
    .with_trajectory_repo(...)

Recording stays best-effort: every recorder error is swallowed with a
tracing::warn!; a telemetry failure never aborts an agent run.

Test plan

  • cargo check --workspace clean
  • cargo test -p forge_app — 736 tests pass, including the new
    parent_and_child_recorders_share_conversation_trajectory
    regression test which asserts the exact acceptance criterion at
    the recorder layer (distinct agent_ids in conversation, child
    rows reference parent, root rows have no parent_agent_id)
  • follow-up backport PR to release/0.1.9

Generated with Devin

Closes #33.

Before this change, the `/trace` command (and any tooling that queries
`trajectory_events`) saw only the root agent's tool calls. Every Task
dispatch that ran inside `AgentExecutor::execute` constructed a fresh
`ForgeApp` *without* a recorder, so the child orchestrator silently
bypassed the trajectory pipeline. The acceptance criterion was:

  SELECT DISTINCT agent_id
  FROM trajectory_events
  WHERE conversation_id = ?

…should return parent + every dispatched child, with child rows linked
back via `parent_agent_id`. It returned only the parent.

Fix: thread the raw `Arc<dyn TrajectoryRepo>` from `ForgeAPI::chat`
through `ForgeApp -> ToolRegistry -> AgentExecutor` via a constructor
injection (`with_trajectory_repo`), then have `AgentExecutor::execute`
build a *child* `TrajectoryRecorder` scoped to
`(conversation_id, child_agent_id, Some(parent_agent_id))` before
spinning up the child's `ForgeApp`. Recursive dispatches inherit the
repo, so grandchild runs build their own recorders against the same
backend and the `/trace` tree walks naturally.

Plumbing notes:
  - Added `AgentExecutor::with_trajectory_repo(...)` and a new
    `parent_agent_id` parameter on `AgentExecutor::execute(...)`
  - Updated both Task-tool + agent-delegation callsites in
    `ToolRegistry` to capture `agent.id` and pass it through
  - Added `ForgeApp::with_trajectory_repo(...)` mirroring the existing
    `with_trajectory_recorder(...)` chain; both are wired together in
    `ForgeAPI::chat`
  - Recording stays best-effort: every recorder error is swallowed
    with a `tracing::warn!`; a telemetry failure never aborts an
    agent run

Tests: added `parent_and_child_recorders_share_conversation_trajectory`
to `trajectory_recorder::tests` which asserts the exact acceptance
criterion at the recorder layer (distinct agent_ids in the
conversation, all child rows reference the parent, root rows have no
parent_agent_id). Full `cargo test -p forge_app` is green (736
tests).

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: blackfloofie-a codegraff agent <265516171+blackfloofie@users.noreply.github.com>
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@github-actions github-actions Bot added the type: fix Iterations on existing features or infrastructure. label May 20, 2026
@justrach justrach merged commit 7a63c2c into main May 20, 2026
3 checks passed
@justrach justrach deleted the fix/subagent-trajectory-recording branch May 20, 2026 19:58
justrach added a commit that referenced this pull request May 20, 2026
…113)

Follow-up to #111. The earlier fix wired `Arc<dyn TrajectoryRepo>`
through `ForgeApp.tool_registry`, but the orchestrator's actual
tool-dispatch path goes through `services.call()` — the blanket
`AgentService::call` impl in `crate::agent` constructs a *fresh*
`ToolRegistry` per call without that repo. So even with the wiring
from #111, every Task dispatch still built a registry with
`AgentExecutor::trajectory_repo == None`, and the child run silently
bypassed recording exactly as before.

End-to-end verification (graff 0.1.9 release binary, real LLM
dispatch of `sage` from the `forge` parent):

  Before #111:           4 rows, 1 distinct agent_id, 0 parent_agent_id
  After #111:            same row count for parent conv, child rows
                         appeared but under a *new* conversation_id —
                         so `/trace <parent>` still showed only the
                         dispatch event
  After this commit:     8 rows under the parent conv_id, 2 distinct
                         agent_ids ({forge, sage}), every sage row
                         has parent_agent_id=forge, `/trace` renders
                         the child indented under the Task dispatch

Implementation:

1. `Services` trait: add `trajectory_repo()` default-`None` method.
   Overridden on `ForgeServices<F>` to return `self.infra.clone()`
   when `F: TrajectoryRepo`. This is the per-call hook the blanket
   AgentService impl needs to thread the repo into the registry
   it builds.

2. Blanket `AgentService::call` (`crate::agent`): chain
   `.with_trajectory_repo(self.trajectory_repo())` onto the
   per-call ToolRegistry. Now *every* tool dispatch — including the
   orchestrator's `services.call(...)` path — flows through a
   registry whose AgentExecutor can build child recorders.

3. `ToolCallContext` (`forge_domain`): add
   `parent_conversation_id: Option<ConversationId>`. Set by the
   orchestrator from its trajectory recorder's bound conv_id
   (falling back to its own conversation.id when recording is
   disabled). `AgentExecutor::execute` reads this via
   `parent_conversation_id_ref()` and uses it as the recorder's
   conversation_id, so child events land in the parent's
   trajectory bucket — and grandchildren land in the *root's* bucket
   because the propagation is recorder-id → ctx → next executor's
   recorder-id, not orchestrator's own conv_id.

4. `TrajectoryRecorder::conversation_id()`: small accessor so the
   orchestrator can read its own recorder's bound conv_id to set
   on the tool context (step 3).

All workspace tests pass (cargo test --workspace, 0 failures).
Recorder-layer regression test from #111 still green.

Generated with [Devin](https://cli.devin.ai/docs)

Co-authored-by: blackfloofie-a codegraff agent <265516171+blackfloofie@users.noreply.github.com>
Co-authored-by: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
justrach added a commit that referenced this pull request May 20, 2026
…114)

Follow-up to #111. The earlier fix wired `Arc<dyn TrajectoryRepo>`
through `ForgeApp.tool_registry`, but the orchestrator's actual
tool-dispatch path goes through `services.call()` — the blanket
`AgentService::call` impl in `crate::agent` constructs a *fresh*
`ToolRegistry` per call without that repo. So even with the wiring
from #111, every Task dispatch still built a registry with
`AgentExecutor::trajectory_repo == None`, and the child run silently
bypassed recording exactly as before.

End-to-end verification (graff 0.1.9 release binary, real LLM
dispatch of `sage` from the `forge` parent):

  Before #111:           4 rows, 1 distinct agent_id, 0 parent_agent_id
  After #111:            same row count for parent conv, child rows
                         appeared but under a *new* conversation_id —
                         so `/trace <parent>` still showed only the
                         dispatch event
  After this commit:     8 rows under the parent conv_id, 2 distinct
                         agent_ids ({forge, sage}), every sage row
                         has parent_agent_id=forge, `/trace` renders
                         the child indented under the Task dispatch

Implementation:

1. `Services` trait: add `trajectory_repo()` default-`None` method.
   Overridden on `ForgeServices<F>` to return `self.infra.clone()`
   when `F: TrajectoryRepo`. This is the per-call hook the blanket
   AgentService impl needs to thread the repo into the registry
   it builds.

2. Blanket `AgentService::call` (`crate::agent`): chain
   `.with_trajectory_repo(self.trajectory_repo())` onto the
   per-call ToolRegistry. Now *every* tool dispatch — including the
   orchestrator's `services.call(...)` path — flows through a
   registry whose AgentExecutor can build child recorders.

3. `ToolCallContext` (`forge_domain`): add
   `parent_conversation_id: Option<ConversationId>`. Set by the
   orchestrator from its trajectory recorder's bound conv_id
   (falling back to its own conversation.id when recording is
   disabled). `AgentExecutor::execute` reads this via
   `parent_conversation_id_ref()` and uses it as the recorder's
   conversation_id, so child events land in the parent's
   trajectory bucket — and grandchildren land in the *root's* bucket
   because the propagation is recorder-id → ctx → next executor's
   recorder-id, not orchestrator's own conv_id.

4. `TrajectoryRecorder::conversation_id()`: small accessor so the
   orchestrator can read its own recorder's bound conv_id to set
   on the tool context (step 3).

All workspace tests pass (cargo test --workspace, 0 failures).
Recorder-layer regression test from #111 still green.

Generated with [Devin](https://cli.devin.ai/docs)

Co-authored-by: blackfloofie-a codegraff agent <265516171+blackfloofie@users.noreply.github.com>
Co-authored-by: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type: fix Iterations on existing features or infrastructure.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(observability): subagent runs bypass TrajectoryRecorder

1 participant