feat: propagate W3C TRACEPARENT into all engine execution steps#38953
Conversation
Add applyTraceContextEnvToMap helper that injects the TRACEPARENT env var into every engine execution step, derived from the GITHUB_AW_OTEL_TRACE_ID and GITHUB_AW_OTEL_PARENT_SPAN_ID values already written to GITHUB_ENV by the setup action. The W3C traceparent value is formatted as: 00-<trace-id>-<span-id>-01 A conditional GitHub Actions expression is used so the variable resolves to an empty string (a safe no-op) when OTEL is not configured. This allows engines that support inbound W3C trace context (e.g. Claude Code running with -p / headless) to nest their internal spans (interaction, LLM request, tool call) under the gh-aw.agent.setup span, producing a complete end-to-end distributed trace in the OTEL backend. Closes #38905 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…drop orphaned comment Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR ensures engine processes receive W3C trace context so engine-internal spans (e.g., Claude Code/Codex/Copilot tool and request spans) correctly attach to the overall workflow trace rather than appearing orphaned.
Changes:
- Added a shared helper to inject
TRACEPARENTinto an engine step’senv:map using a GitHub Actions expression derived fromGITHUB_AW_OTEL_TRACE_IDandGITHUB_AW_OTEL_PARENT_SPAN_ID. - Wired the helper into each engine’s execution-step env construction before
engine.envoverrides are merged. - Updated golden workflow fixtures and added unit tests covering the helper behavior.
Show a summary per file
| File | Description |
|---|---|
| pkg/workflow/safe_outputs_env.go | Adds applyTraceContextEnvToMap helper to inject TRACEPARENT expression into env maps. |
| pkg/workflow/safe_output_helpers_test.go | Adds unit tests validating applyTraceContextEnvToMap behavior and expression shape. |
| pkg/workflow/claude_engine.go | Applies trace context injection to Claude engine step env. |
| pkg/workflow/codex_engine.go | Applies trace context injection to Codex engine step env. |
| pkg/workflow/copilot_engine_execution.go | Applies trace context injection to Copilot engine execution env. |
| pkg/workflow/crush_engine.go | Applies trace context injection to Crush engine step env. |
| pkg/workflow/gemini_engine.go | Applies trace context injection to Gemini engine step env. |
| pkg/workflow/opencode_engine.go | Applies trace context injection to OpenCode engine step env. |
| pkg/workflow/pi_engine.go | Applies trace context injection to Pi engine step env. |
| pkg/workflow/antigravity_engine.go | Applies trace context injection to Antigravity engine step env. |
| pkg/workflow/testdata/TestWasmGolden_CompileFixtures/with-imports.golden | Updates compiled fixture to include TRACEPARENT in engine env. |
| pkg/workflow/testdata/TestWasmGolden_CompileFixtures/smoke-copilot.golden | Updates compiled fixture to include TRACEPARENT in engine env. |
| pkg/workflow/testdata/TestWasmGolden_CompileFixtures/playwright-cli-mode.golden | Updates compiled fixture to include TRACEPARENT in engine env. |
| pkg/workflow/testdata/TestWasmGolden_CompileFixtures/basic-copilot.golden | Updates compiled fixture to include TRACEPARENT in engine env. |
| pkg/workflow/testdata/TestWasmGolden_AllEngines/pi.golden | Updates golden to include TRACEPARENT for Pi engine env. |
| pkg/workflow/testdata/TestWasmGolden_AllEngines/gemini.golden | Updates golden to include TRACEPARENT for Gemini engine env. |
| pkg/workflow/testdata/TestWasmGolden_AllEngines/copilot.golden | Updates golden to include TRACEPARENT for Copilot engine env. |
| pkg/workflow/testdata/TestWasmGolden_AllEngines/codex.golden | Updates golden to include TRACEPARENT for Codex engine env. |
| pkg/workflow/testdata/TestWasmGolden_AllEngines/claude.golden | Updates golden to include TRACEPARENT for Claude engine env. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 19/19 changed files
- Comments generated: 0
|
✅ PR Code Quality Reviewer completed the code quality review. |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. |
|
🧪 Test Quality Sentinel completed test quality analysis. |
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
🏗️ Design Decision Gate — ADR RequiredThis PR makes significant changes to core business logic (>100 new lines in 📄 Draft ADR committed:
📋 What to do next
Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision. ❓ Why ADRs MatterADRs create a searchable, permanent record of why the codebase looks the way it does. Future contributors (and your future self) will thank you for capturing the trade-offs behind a cross-cutting change like trace-context propagation across all eight engines. 📋 Michael Nygard ADR Format ReferenceAn ADR must contain these four sections to be considered complete:
All ADRs are stored in
|
🧪 Test Quality Sentinel Report✅ Test Quality Score: 90/100 — Excellent
📊 Metrics & Test Classification (3 tests analyzed)
Test Classification Details
Language SupportTests analyzed:
Verdict
📖 Understanding Test ClassificationsDesign Tests (High Value) verify what the system does:
Implementation Tests (Low Value) verify how the system does it:
Goal: Shift toward tests that describe the system’s behavioral contract — the promises it makes to its users and collaborators.
|
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /tdd and /grill-with-docs — requesting changes on test robustness and coverage gaps.
📋 Key Themes & Highlights
Key Themes
- Fragile expression test: the "adds TRACEPARENT to empty map" subtest uses 5 separate
strings.Containsfragment checks instead of a single exact assertion, so a structurally valid but semantically wrong expression can pass all checks - Missing integration test: the user-override-wins invariant (
maps.Copyordering) is documented but not tested at the engine level - Test coverage gap:
antigravity,crush, andopencodeengines have no compiled-YAML snapshot asserting thatTRACEPARENTappears — regressions in those engines would be silent - Concern leakage:
applyTraceContextEnvToMap(OTEL propagation) lives insafe_outputs_env.go(safe-output plumbing), which conflates two distinct concerns - Unguarded string key:
"TRACEPARENT"is used as a bare string literal with no named constant
Positive Highlights
- ✅ Clean DRY helper: one function shared across all 8 engines prevents drift
- ✅ Safe fallback:
|| ''means engines that don't use OTEL see an empty string, not a malformed traceparent - ✅ Correct call-site ordering: placed before
maps.Copy(user env)so user-supplied overrides win naturally - ✅ Excellent godoc on the new function — the intent and W3C version/flags rationale are clearly explained
- ✅ All 9 golden files updated consistently
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 582.2 AIC · ⌖ 26 AIC · ⊞ 29.3K
| } | ||
| // Fallback to empty string when OTEL not configured | ||
| if !strings.Contains(val, "|| ''") { | ||
| t.Errorf("TRACEPARENT expression should fall back to empty string when OTEL not configured, got: %q", val) |
There was a problem hiding this comment.
[/tdd] The five strings.Contains fragment checks do not pin the full expression — a value like "GITHUB_AW_OTEL_TRACE_ID-00--01-GITHUB_AW_OTEL_PARENT_SPAN_ID || ''" would pass all five assertions while being an invalid traceparent. A single exact equality check also serves as a living specification.
💡 Suggested replacement
const wantExpr = "${{ env.GITHUB_AW_OTEL_TRACE_ID != '' && env.GITHUB_AW_OTEL_PARENT_SPAN_ID != '' && format('00-{0}-{1}-01', env.GITHUB_AW_OTEL_TRACE_ID, env.GITHUB_AW_OTEL_PARENT_SPAN_ID) || '' }}"
if val != wantExpr {
t.Errorf("unexpected TRACEPARENT expression:\ngot: %q\nwant: %q", val, wantExpr)
}This makes the test a precise specification of the GHA expression format, so any accidental mutation is immediately visible in the diff.
| if env["TRACEPARENT"] == "00-custom-parent-01" { | ||
| t.Error("applyTraceContextEnvToMap should replace the prior value with the GHA expression") | ||
| } | ||
| }) |
There was a problem hiding this comment.
[/tdd] The comment documents that user overrides win via maps.Copy(env, workflowData.EngineConfig.Env), but no test exercises this contract end-to-end in an engine. If the call-site ordering is ever accidentally reversed, the invariant breaks silently.
💡 Suggested test sketch
Add a test in e.g. claude_engine_test.go (or the existing table-driven engine-env tests) that compiles a workflow with engine.env: {TRACEPARENT: "00-user-override-01"} and asserts that the compiled YAML env block shows TRACEPARENT: 00-user-override-01 rather than the GHA expression. This pins the override-wins contract without relying on documentation alone.
|
|
||
| var safeOutputsEnvLog = logger.New("workflow:safe_outputs_env") | ||
|
|
||
| // ======================================== |
There was a problem hiding this comment.
[/grill-with-docs] applyTraceContextEnvToMap is about W3C OTEL trace propagation, not safe outputs. Placing it at the top of safe_outputs_env.go creates a misleading file-ownership boundary — a reader navigating to the safe-outputs utilities will encounter trace-context code with no obvious connection.
💡 Suggested refactor
Extract this function (and the // Trace Context Environment Variables section) into a new file pkg/workflow/trace_context_env.go. This keeps safe_outputs_env.go focused on a single concern and makes both files easier to discover when navigating the workflow package.
Alternatively, if co-location is intentional, move the section to the bottom of the file (after all safe-output helpers) so the file name still signals the primary content to the reader.
| // Format: 00-<trace-id>-<span-id>-01 (W3C traceparent, sampled flag = 01). | ||
| // Only set when both OTEL IDs are non-empty; the conditional || '' fallback | ||
| // ensures engines see an empty string rather than a malformed traceparent. | ||
| env["TRACEPARENT"] = "${{ env.GITHUB_AW_OTEL_TRACE_ID != '' && env.GITHUB_AW_OTEL_PARENT_SPAN_ID != '' && format('00-{0}-{1}-01', env.GITHUB_AW_OTEL_TRACE_ID, env.GITHUB_AW_OTEL_PARENT_SPAN_ID) || '' }}" |
There was a problem hiding this comment.
[/grill-with-docs] The string "TRACEPARENT" is used directly as a map key without a named constant. If this key needs to be referenced elsewhere — for validation, documentation, or a future test that verifies the key is absent — there is no single source of truth.
💡 Suggested constant
// traceParentEnvKey is the W3C Trace Context env-var name consumed by engines
// that implement OTLP/W3C trace propagation (e.g. Claude Code, Codex).
const traceParentEnvKey = "TRACEPARENT"Replace the string literal in applyTraceContextEnvToMap and the matching assertions in TestApplyTraceContextEnvToMap with traceParentEnvKey.
| applySafeOutputEnvToMap(env, workflowData) | ||
|
|
||
| // Propagate W3C trace context so engine spans nest under the gh-aw.agent.setup span. | ||
| applyTraceContextEnvToMap(env) |
There was a problem hiding this comment.
[/tdd] TestWasmGolden_AllEngines covers only 5 of 8 engines (copilot, claude, codex, gemini, pi). The antigravity, crush, and opencode engines are not in that test suite and have no compiled-YAML snapshot that would catch a missing or misplaced TRACEPARENT. The same gap exists in the corresponding engine files.
💡 Suggested gap-fill
Either:
- Add
antigravity,crush, andopencodeentries toTestWasmGolden_AllEngines(simplest — just extend theenginesslice and rungo test -update-golden), or - Add a
strings.Contains(stepContent, "TRACEPARENT")assertion to the existing unit tests inantigravity_engine_test.go,crush_engine_test.go, andopencode_engine_test.go.
Without this, accidentally removing applyTraceContextEnvToMap(env) from any of these three engines would not be caught by CI.
There was a problem hiding this comment.
✅ LGTM — two non-blocking notes
The W3C trace-context propagation is correct: applyTraceContextEnvToMap is called consistently in all 8 engines, the GHA conditional expression handles the "OTEL not configured" fallback correctly, the call ordering relative to maps.Copy is right, and the golden tests lock in the generated YAML output.
Two non-blocking observations in inline comments:
- Misleading inline comment (
safe_outputs_env.goL36) —"Only set when both OTEL IDs are non-empty"implies a Go-level guard; clarifying it refers to the runtime GHA expression would prevent a future "fix" that breaks propagation. - Ordering contract not pinned by a test (
safe_output_helpers_test.goL332) — the claim that userTRACEPARENToverrides win viamaps.Copyis only documented, not exercised; worth closing before more engines are added.
🔎 Code quality review by PR Code Quality Reviewer · 497.4 AIC · ⌖ 13.4 AIC · ⊞ 17.3K
| // Format: 00-<trace-id>-<span-id>-01 (W3C traceparent, sampled flag = 01). | ||
| // Only set when both OTEL IDs are non-empty; the conditional || '' fallback | ||
| // ensures engines see an empty string rather than a malformed traceparent. | ||
| env["TRACEPARENT"] = "${{ env.GITHUB_AW_OTEL_TRACE_ID != '' && env.GITHUB_AW_OTEL_PARENT_SPAN_ID != '' && format('00-{0}-{1}-01', env.GITHUB_AW_OTEL_TRACE_ID, env.GITHUB_AW_OTEL_PARENT_SPAN_ID) || '' }}" |
There was a problem hiding this comment.
Inline comment misdescribes when the Go assignment runs.
// Only set when both OTEL IDs are non-empty sounds like this block is guarded by a conditional, but env["TRACEPARENT"] is assigned unconditionally — the condition lives inside the GHA expression string, not in Go. A future reader could "fix" the seemingly wrong unconditional write by adding an if guard, which would silently break TRACEPARENT propagation when both IDs are present.
💡 Suggested wording
// The GHA expression evaluates to a non-empty traceparent only when both OTEL IDs
// are present at runtime; the || '' fallback ensures engines receive an empty string
// (treated as "no parent context") rather than a malformed header when OTEL is not configured.This makes clear the conditional logic lives inside the GHA expression, not in a Go-level guard.
|
|
||
| t.Run("sets TRACEPARENT unconditionally", func(t *testing.T) { | ||
| // applyTraceContextEnvToMap always sets TRACEPARENT. User overrides are applied | ||
| // afterwards by the engine via maps.Copy(env, workflowData.EngineConfig.Env), |
There was a problem hiding this comment.
The ordering contract is documented here but not exercised by any test.
The comment claims user-supplied TRACEPARENT wins because maps.Copy(env, workflowData.EngineConfig.Env) runs after applyTraceContextEnvToMap in every engine. No test verifies this ordering, so a refactor that swaps the two calls would silently discard the user override without any test failure.
💡 How to close the gap
A table-driven engine test (or a single integration subtest in an existing engine test file) that sets workflowData.EngineConfig.Env["TRACEPARENT"] = "custom-value", calls GetExecutionSteps, and asserts the emitted step env contains "custom-value" would lock in this contract across all engines.
Alternatively, a golden fixture with a user-defined TRACEPARENT override would achieve the same coverage at the integration level.
This is non-blocking — the current code is correct — but the ordering invariant is worth pinning before the codebase grows more engines.
gh-aw already writes
GITHUB_AW_OTEL_TRACE_IDandGITHUB_AW_OTEL_PARENT_SPAN_IDtoGITHUB_ENVand uses them to correctly parent MCP gateway spans undergh-aw.agent.setup— but never propagated this context to the engine process itself. Engine-internal spans (e.g. Claude Code'sclaude_code.interaction,claude_code.llm_request,claude_code.tool) were therefore orphaned from the workflow trace entirely.Changes
New shared helper
applyTraceContextEnvToMapinsafe_outputs_env.go— injectsTRACEPARENTinto the engine step'senv:map using a conditional GitHub Actions expression:Resolves to a valid W3C traceparent when both IDs are present; falls back to empty string (safe no-op) when OTEL is not configured.
Called from all 8 engines (claude, codex, copilot, crush, gemini, opencode, antigravity, pi) immediately after
applySafeOutputEnvToMap, before userengine.envoverrides are merged — so a user-suppliedTRACEPARENTinengine.envnaturally wins.Golden files updated for all affected compiled-output snapshots.
The natural parent is the
gh-aw.agent.setupspan, keeping engine spans and MCP gateway spans as siblings under the same root — matching the existing gateway trace hierarchy.