Skip to content

feat: propagate W3C TRACEPARENT into all engine execution steps#38953

Merged
pelikhan merged 4 commits into
mainfrom
copilot/propagate-w3c-trace-context
Jun 12, 2026
Merged

feat: propagate W3C TRACEPARENT into all engine execution steps#38953
pelikhan merged 4 commits into
mainfrom
copilot/propagate-w3c-trace-context

Conversation

Copilot AI commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

gh-aw already writes GITHUB_AW_OTEL_TRACE_ID and GITHUB_AW_OTEL_PARENT_SPAN_ID to GITHUB_ENV and uses them to correctly parent MCP gateway spans under gh-aw.agent.setup — but never propagated this context to the engine process itself. Engine-internal spans (e.g. Claude Code's claude_code.interaction, claude_code.llm_request, claude_code.tool) were therefore orphaned from the workflow trace entirely.

Changes

  • New shared helper applyTraceContextEnvToMap in safe_outputs_env.go — injects TRACEPARENT into the engine step's env: map using a conditional GitHub Actions expression:

    TRACEPARENT: ${{ env.GITHUB_AW_OTEL_TRACE_ID != '' && env.GITHUB_AW_OTEL_PARENT_SPAN_ID != '' && format('00-{0}-{1}-01', env.GITHUB_AW_OTEL_TRACE_ID, env.GITHUB_AW_OTEL_PARENT_SPAN_ID) || '' }}

    Resolves to a valid W3C traceparent when both IDs are present; falls back to empty string (safe no-op) when OTEL is not configured.

  • Called from all 8 engines (claude, codex, copilot, crush, gemini, opencode, antigravity, pi) immediately after applySafeOutputEnvToMap, before user engine.env overrides are merged — so a user-supplied TRACEPARENT in engine.env naturally wins.

  • Golden files updated for all affected compiled-output snapshots.

The natural parent is the gh-aw.agent.setup span, keeping engine spans and MCP gateway spans as siblings under the same root — matching the existing gateway trace hierarchy.

Copilot AI and others added 2 commits June 12, 2026 22:37
Add applyTraceContextEnvToMap helper that injects the TRACEPARENT env
var into every engine execution step, derived from the
GITHUB_AW_OTEL_TRACE_ID and GITHUB_AW_OTEL_PARENT_SPAN_ID values already
written to GITHUB_ENV by the setup action.

The W3C traceparent value is formatted as:
  00-<trace-id>-<span-id>-01

A conditional GitHub Actions expression is used so the variable resolves
to an empty string (a safe no-op) when OTEL is not configured.

This allows engines that support inbound W3C trace context (e.g. Claude
Code running with -p / headless) to nest their internal spans
(interaction, LLM request, tool call) under the gh-aw.agent.setup span,
producing a complete end-to-end distributed trace in the OTEL backend.

Closes #38905

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…drop orphaned comment

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title [WIP] Add support for W3C TRACEPARENT and TRACESTATE in agent engine feat: propagate W3C TRACEPARENT into all engine execution steps Jun 12, 2026
Copilot AI requested a review from pelikhan June 12, 2026 22:43
@pelikhan pelikhan marked this pull request as ready for review June 12, 2026 22:52
Copilot AI review requested due to automatic review settings June 12, 2026 22:52

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR ensures engine processes receive W3C trace context so engine-internal spans (e.g., Claude Code/Codex/Copilot tool and request spans) correctly attach to the overall workflow trace rather than appearing orphaned.

Changes:

  • Added a shared helper to inject TRACEPARENT into an engine step’s env: map using a GitHub Actions expression derived from GITHUB_AW_OTEL_TRACE_ID and GITHUB_AW_OTEL_PARENT_SPAN_ID.
  • Wired the helper into each engine’s execution-step env construction before engine.env overrides are merged.
  • Updated golden workflow fixtures and added unit tests covering the helper behavior.
Show a summary per file
File Description
pkg/workflow/safe_outputs_env.go Adds applyTraceContextEnvToMap helper to inject TRACEPARENT expression into env maps.
pkg/workflow/safe_output_helpers_test.go Adds unit tests validating applyTraceContextEnvToMap behavior and expression shape.
pkg/workflow/claude_engine.go Applies trace context injection to Claude engine step env.
pkg/workflow/codex_engine.go Applies trace context injection to Codex engine step env.
pkg/workflow/copilot_engine_execution.go Applies trace context injection to Copilot engine execution env.
pkg/workflow/crush_engine.go Applies trace context injection to Crush engine step env.
pkg/workflow/gemini_engine.go Applies trace context injection to Gemini engine step env.
pkg/workflow/opencode_engine.go Applies trace context injection to OpenCode engine step env.
pkg/workflow/pi_engine.go Applies trace context injection to Pi engine step env.
pkg/workflow/antigravity_engine.go Applies trace context injection to Antigravity engine step env.
pkg/workflow/testdata/TestWasmGolden_CompileFixtures/with-imports.golden Updates compiled fixture to include TRACEPARENT in engine env.
pkg/workflow/testdata/TestWasmGolden_CompileFixtures/smoke-copilot.golden Updates compiled fixture to include TRACEPARENT in engine env.
pkg/workflow/testdata/TestWasmGolden_CompileFixtures/playwright-cli-mode.golden Updates compiled fixture to include TRACEPARENT in engine env.
pkg/workflow/testdata/TestWasmGolden_CompileFixtures/basic-copilot.golden Updates compiled fixture to include TRACEPARENT in engine env.
pkg/workflow/testdata/TestWasmGolden_AllEngines/pi.golden Updates golden to include TRACEPARENT for Pi engine env.
pkg/workflow/testdata/TestWasmGolden_AllEngines/gemini.golden Updates golden to include TRACEPARENT for Gemini engine env.
pkg/workflow/testdata/TestWasmGolden_AllEngines/copilot.golden Updates golden to include TRACEPARENT for Copilot engine env.
pkg/workflow/testdata/TestWasmGolden_AllEngines/codex.golden Updates golden to include TRACEPARENT for Codex engine env.
pkg/workflow/testdata/TestWasmGolden_AllEngines/claude.golden Updates golden to include TRACEPARENT for Claude engine env.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 19/19 changed files
  • Comments generated: 0

@github-actions

github-actions Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

PR Code Quality Reviewer completed the code quality review.

@github-actions

github-actions Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Design Decision Gate 🏗️ completed the design decision gate check.

@github-actions

github-actions Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

🧪 Test Quality Sentinel completed test quality analysis.

@github-actions

github-actions Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

Copy link
Copy Markdown
Contributor

🏗️ Design Decision Gate — ADR Required

This PR makes significant changes to core business logic (>100 new lines in pkg/) but does not have a linked Architecture Decision Record (ADR).

📄 Draft ADR committed: docs/adr/38953-propagate-w3c-trace-context-to-engines.md — review and complete it before merging.

🔒 This PR cannot merge until an ADR is linked in the PR body.

📋 What to do next
  1. Review the draft ADR committed to your branch — it was generated from the PR diff.
  2. Complete the missing sections — add context the AI could not infer, refine the decision rationale, and confirm the alternatives you actually considered.
  3. Commit the finalized ADR to docs/adr/ on your branch.
  4. Reference the ADR in this PR body by adding a line such as:

    ADR: ADR-38953: Propagate W3C Trace Context into Engine Execution Steps

Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision.

❓ Why ADRs Matter

ADRs create a searchable, permanent record of why the codebase looks the way it does. Future contributors (and your future self) will thank you for capturing the trade-offs behind a cross-cutting change like trace-context propagation across all eight engines.

📋 Michael Nygard ADR Format Reference

An ADR must contain these four sections to be considered complete:

  • Context — What is the problem? What forces are at play?
  • Decision — What did you decide? Why?
  • Alternatives Considered — What else could have been done?
  • Consequences — What are the trade-offs (positive and negative)?

All ADRs are stored in docs/adr/ as Markdown files numbered by PR number.

🔒 This PR cannot merge until an ADR is linked in the PR body.

🏗️ ADR gate enforced by Design Decision Gate 🏗️ · 80.6 AIC · ⌖ 9.69 AIC · ⊞ 13.8K ·

@github-actions

Copy link
Copy Markdown
Contributor

🧪 Test Quality Sentinel Report

Test Quality Score: 90/100 — Excellent

Analyzed 3 test(s) (3 sub-cases of TestApplyTraceContextEnvToMap): 3 design, 0 implementation, 0 guideline violation(s).

📊 Metrics & Test Classification (3 tests analyzed)
Metric Value
New/modified tests analyzed 3
✅ Design tests (behavioral contracts) 3 (100%)
⚠️ Implementation tests (low value) 0 (0%)
Tests with error/edge cases 3 (100%)
Duplicate test clusters 0
Test inflation detected ⚠️ Yes — 66 test lines / 25 production lines = 2.64× (−10 pts)
🚨 Coding-guideline violations 0

Test Classification Details

Test File Classification Issues Detected
TestApplyTraceContextEnvToMap/adds_TRACEPARENT_to_empty_map pkg/workflow/safe_output_helpers_test.go:~297 ✅ Design None — verifies W3C expression shape, key count, and fallback semantics
TestApplyTraceContextEnvToMap/sets_TRACEPARENT_unconditionally pkg/workflow/safe_output_helpers_test.go:~330 ✅ Design None — verifies overwrite behavior (no conditional guard)
TestApplyTraceContextEnvToMap/preserves_existing_keys_in_map pkg/workflow/safe_output_helpers_test.go:~342 ✅ Design None — verifies non-destructive side-effect contract

Language Support

Tests analyzed:

  • 🐹 Go (*_test.go): 3 tests — unit (//go:build !integration) ✅
  • 🟨 JavaScript (*.test.cjs, *.test.js): 0 tests (none changed in this PR)

Verdict

Check passed. 0% of new tests are implementation tests (threshold: 30%). Score of 90/100 reflects a 10-point inflation penalty (test file grew 2.64× faster than production code), but all 3 sub-cases are high-value behavioral contracts with comprehensive assertions.

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

  • Assert on observable outputs, return values, or state changes
  • Cover error paths and boundary conditions
  • Would catch a behavioral regression if deleted
  • Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

  • Assert on internal function calls (mocking internals)
  • Only test the happy path with typical inputs
  • Break during legitimate refactoring even when behavior is correct
  • Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system’s behavioral contract — the promises it makes to its users and collaborators.

🧪 Test quality analysis by Test Quality Sentinel · 277 AIC · ⌖ 19.2 AIC · ⊞ 28.3K ·

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Test Quality Sentinel: 90/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). All 3 sub-cases of TestApplyTraceContextEnvToMap are high-value behavioral contracts with no guideline violations.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skills-Based Review 🧠

Applied /tdd and /grill-with-docs — requesting changes on test robustness and coverage gaps.

📋 Key Themes & Highlights

Key Themes

  • Fragile expression test: the "adds TRACEPARENT to empty map" subtest uses 5 separate strings.Contains fragment checks instead of a single exact assertion, so a structurally valid but semantically wrong expression can pass all checks
  • Missing integration test: the user-override-wins invariant (maps.Copy ordering) is documented but not tested at the engine level
  • Test coverage gap: antigravity, crush, and opencode engines have no compiled-YAML snapshot asserting that TRACEPARENT appears — regressions in those engines would be silent
  • Concern leakage: applyTraceContextEnvToMap (OTEL propagation) lives in safe_outputs_env.go (safe-output plumbing), which conflates two distinct concerns
  • Unguarded string key: "TRACEPARENT" is used as a bare string literal with no named constant

Positive Highlights

  • ✅ Clean DRY helper: one function shared across all 8 engines prevents drift
  • ✅ Safe fallback: || '' means engines that don't use OTEL see an empty string, not a malformed traceparent
  • ✅ Correct call-site ordering: placed before maps.Copy(user env) so user-supplied overrides win naturally
  • ✅ Excellent godoc on the new function — the intent and W3C version/flags rationale are clearly explained
  • ✅ All 9 golden files updated consistently

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 582.2 AIC · ⌖ 26 AIC · ⊞ 29.3K

}
// Fallback to empty string when OTEL not configured
if !strings.Contains(val, "|| ''") {
t.Errorf("TRACEPARENT expression should fall back to empty string when OTEL not configured, got: %q", val)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] The five strings.Contains fragment checks do not pin the full expression — a value like "GITHUB_AW_OTEL_TRACE_ID-00--01-GITHUB_AW_OTEL_PARENT_SPAN_ID || ''" would pass all five assertions while being an invalid traceparent. A single exact equality check also serves as a living specification.

💡 Suggested replacement
const wantExpr = "${{ env.GITHUB_AW_OTEL_TRACE_ID != '' && env.GITHUB_AW_OTEL_PARENT_SPAN_ID != '' && format('00-{0}-{1}-01', env.GITHUB_AW_OTEL_TRACE_ID, env.GITHUB_AW_OTEL_PARENT_SPAN_ID) || '' }}"
if val != wantExpr {
    t.Errorf("unexpected TRACEPARENT expression:\ngot:  %q\nwant: %q", val, wantExpr)
}

This makes the test a precise specification of the GHA expression format, so any accidental mutation is immediately visible in the diff.

if env["TRACEPARENT"] == "00-custom-parent-01" {
t.Error("applyTraceContextEnvToMap should replace the prior value with the GHA expression")
}
})

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] The comment documents that user overrides win via maps.Copy(env, workflowData.EngineConfig.Env), but no test exercises this contract end-to-end in an engine. If the call-site ordering is ever accidentally reversed, the invariant breaks silently.

💡 Suggested test sketch

Add a test in e.g. claude_engine_test.go (or the existing table-driven engine-env tests) that compiles a workflow with engine.env: {TRACEPARENT: "00-user-override-01"} and asserts that the compiled YAML env block shows TRACEPARENT: 00-user-override-01 rather than the GHA expression. This pins the override-wins contract without relying on documentation alone.


var safeOutputsEnvLog = logger.New("workflow:safe_outputs_env")

// ========================================

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/grill-with-docs] applyTraceContextEnvToMap is about W3C OTEL trace propagation, not safe outputs. Placing it at the top of safe_outputs_env.go creates a misleading file-ownership boundary — a reader navigating to the safe-outputs utilities will encounter trace-context code with no obvious connection.

💡 Suggested refactor

Extract this function (and the // Trace Context Environment Variables section) into a new file pkg/workflow/trace_context_env.go. This keeps safe_outputs_env.go focused on a single concern and makes both files easier to discover when navigating the workflow package.

Alternatively, if co-location is intentional, move the section to the bottom of the file (after all safe-output helpers) so the file name still signals the primary content to the reader.

// Format: 00-<trace-id>-<span-id>-01 (W3C traceparent, sampled flag = 01).
// Only set when both OTEL IDs are non-empty; the conditional || '' fallback
// ensures engines see an empty string rather than a malformed traceparent.
env["TRACEPARENT"] = "${{ env.GITHUB_AW_OTEL_TRACE_ID != '' && env.GITHUB_AW_OTEL_PARENT_SPAN_ID != '' && format('00-{0}-{1}-01', env.GITHUB_AW_OTEL_TRACE_ID, env.GITHUB_AW_OTEL_PARENT_SPAN_ID) || '' }}"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/grill-with-docs] The string "TRACEPARENT" is used directly as a map key without a named constant. If this key needs to be referenced elsewhere — for validation, documentation, or a future test that verifies the key is absent — there is no single source of truth.

💡 Suggested constant
// traceParentEnvKey is the W3C Trace Context env-var name consumed by engines
// that implement OTLP/W3C trace propagation (e.g. Claude Code, Codex).
const traceParentEnvKey = "TRACEPARENT"

Replace the string literal in applyTraceContextEnvToMap and the matching assertions in TestApplyTraceContextEnvToMap with traceParentEnvKey.

applySafeOutputEnvToMap(env, workflowData)

// Propagate W3C trace context so engine spans nest under the gh-aw.agent.setup span.
applyTraceContextEnvToMap(env)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] TestWasmGolden_AllEngines covers only 5 of 8 engines (copilot, claude, codex, gemini, pi). The antigravity, crush, and opencode engines are not in that test suite and have no compiled-YAML snapshot that would catch a missing or misplaced TRACEPARENT. The same gap exists in the corresponding engine files.

💡 Suggested gap-fill

Either:

  1. Add antigravity, crush, and opencode entries to TestWasmGolden_AllEngines (simplest — just extend the engines slice and run go test -update-golden), or
  2. Add a strings.Contains(stepContent, "TRACEPARENT") assertion to the existing unit tests in antigravity_engine_test.go, crush_engine_test.go, and opencode_engine_test.go.

Without this, accidentally removing applyTraceContextEnvToMap(env) from any of these three engines would not be caught by CI.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ LGTM — two non-blocking notes

The W3C trace-context propagation is correct: applyTraceContextEnvToMap is called consistently in all 8 engines, the GHA conditional expression handles the "OTEL not configured" fallback correctly, the call ordering relative to maps.Copy is right, and the golden tests lock in the generated YAML output.

Two non-blocking observations in inline comments:

  1. Misleading inline comment (safe_outputs_env.go L36) — "Only set when both OTEL IDs are non-empty" implies a Go-level guard; clarifying it refers to the runtime GHA expression would prevent a future "fix" that breaks propagation.
  2. Ordering contract not pinned by a test (safe_output_helpers_test.go L332) — the claim that user TRACEPARENT overrides win via maps.Copy is only documented, not exercised; worth closing before more engines are added.

🔎 Code quality review by PR Code Quality Reviewer · 497.4 AIC · ⌖ 13.4 AIC · ⊞ 17.3K

// Format: 00-<trace-id>-<span-id>-01 (W3C traceparent, sampled flag = 01).
// Only set when both OTEL IDs are non-empty; the conditional || '' fallback
// ensures engines see an empty string rather than a malformed traceparent.
env["TRACEPARENT"] = "${{ env.GITHUB_AW_OTEL_TRACE_ID != '' && env.GITHUB_AW_OTEL_PARENT_SPAN_ID != '' && format('00-{0}-{1}-01', env.GITHUB_AW_OTEL_TRACE_ID, env.GITHUB_AW_OTEL_PARENT_SPAN_ID) || '' }}"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline comment misdescribes when the Go assignment runs.

// Only set when both OTEL IDs are non-empty sounds like this block is guarded by a conditional, but env["TRACEPARENT"] is assigned unconditionally — the condition lives inside the GHA expression string, not in Go. A future reader could "fix" the seemingly wrong unconditional write by adding an if guard, which would silently break TRACEPARENT propagation when both IDs are present.

💡 Suggested wording
// The GHA expression evaluates to a non-empty traceparent only when both OTEL IDs
// are present at runtime; the || '' fallback ensures engines receive an empty string
// (treated as "no parent context") rather than a malformed header when OTEL is not configured.

This makes clear the conditional logic lives inside the GHA expression, not in a Go-level guard.


t.Run("sets TRACEPARENT unconditionally", func(t *testing.T) {
// applyTraceContextEnvToMap always sets TRACEPARENT. User overrides are applied
// afterwards by the engine via maps.Copy(env, workflowData.EngineConfig.Env),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ordering contract is documented here but not exercised by any test.

The comment claims user-supplied TRACEPARENT wins because maps.Copy(env, workflowData.EngineConfig.Env) runs after applyTraceContextEnvToMap in every engine. No test verifies this ordering, so a refactor that swaps the two calls would silently discard the user override without any test failure.

💡 How to close the gap

A table-driven engine test (or a single integration subtest in an existing engine test file) that sets workflowData.EngineConfig.Env["TRACEPARENT"] = "custom-value", calls GetExecutionSteps, and asserts the emitted step env contains "custom-value" would lock in this contract across all engines.

Alternatively, a golden fixture with a user-defined TRACEPARENT override would achieve the same coverage at the integration level.

This is non-blocking — the current code is correct — but the ordering invariant is worth pinning before the codebase grows more engines.

@pelikhan pelikhan merged commit e21bd64 into main Jun 12, 2026
29 checks passed
@pelikhan pelikhan deleted the copilot/propagate-w3c-trace-context branch June 12, 2026 23:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants