fix(api-proxy): use copilot_usage token_details for accurate cache split by lpcox · Pull Request #5253 · github/gh-aw-firewall

lpcox · 2026-06-18T15:52:37Z

Problem

For Claude models served through the GitHub Copilot OpenAI-compatible endpoint (/chat/completions via the api-proxy Copilot port), upstream reports a flattened usage object where prompt_tokens lumps fresh input together with cache-write tokens, and cache_creation_input_tokens is absent:

"usage": {
  "prompt_tokens": 16396,            // = input (3857) + cache_write (12539)
  "completion_tokens": 362,
  "prompt_tokens_details": { "cached_tokens": 0 }   // cache_read only
}

The authoritative per-type split lives only in the sibling copilot_usage.token_details:

"copilot_usage": { "token_details": [
  { "token_type": "input",       "token_count": 3857 },
  { "token_type": "cache_read",  "token_count": 0 },
  { "token_type": "cache_write", "token_count": 12539 },
  { "token_type": "output",      "token_count": 362 }
] }

The parser previously read only usage, so it recorded input_tokens = 16396 and cache_write_tokens = 0. The 12,539 cache-write tokens (billed at a higher rate than fresh input on Claude) were silently mis-counted as plain input — a cost-fidelity bug.

Fix

Add extractCopilotUsageBreakdown() to parse copilot_usage.token_details (top-level or nested under response) into normalized fields.
In extractUsageFromJson (non-streaming) and the OpenAI/Copilot SSE final-chunk branch, prefer this breakdown and drop the lumped prompt_tokens.
Plain OpenAI/Copilot responses without copilot_usage are unaffected.
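
The steps above can be sketched roughly as follows. This is not the PR's implementation: `sketchCopilotBreakdown` and the `FIELD_BY_TOKEN_TYPE` table are hypothetical names, and the Anthropic-style output field names (`cache_read_input_tokens`, `cache_creation_input_tokens`) are an assumption about what the real `extractCopilotUsageBreakdown()` emits. The sample payload reuses the numbers from the Problem section.

```javascript
// Assumed mapping from Copilot token_type values to normalized usage fields.
// The field names on the right are an assumption, not taken from the PR diff.
const FIELD_BY_TOKEN_TYPE = new Map([
  ['input', 'input_tokens'],
  ['output', 'output_tokens'],
  ['cache_read', 'cache_read_input_tokens'],
  ['cache_write', 'cache_creation_input_tokens'],
]);

// Sketch only: returns a partial usage object, or null when the payload has
// no usable copilot_usage.token_details array.
function sketchCopilotBreakdown(json) {
  if (json === null || typeof json !== 'object') return null;
  // Accept the array at the top level or nested under `response`.
  const details =
    json.copilot_usage?.token_details ??
    json.response?.copilot_usage?.token_details;
  if (!Array.isArray(details)) return null;

  const breakdown = {};
  for (const entry of details) {
    const field = FIELD_BY_TOKEN_TYPE.get(entry?.token_type);
    const count = entry?.token_count;
    // Skip unknown token types and non-numeric or negative counts.
    if (!field || !Number.isFinite(count) || count < 0) continue;
    breakdown[field] = (breakdown[field] ?? 0) + count;
  }
  return Object.keys(breakdown).length > 0 ? breakdown : null;
}

// Payload shape from the Problem section above.
const body = {
  usage: {
    prompt_tokens: 16396,
    completion_tokens: 362,
    prompt_tokens_details: { cached_tokens: 0 },
  },
  copilot_usage: {
    token_details: [
      { token_type: 'input', token_count: 3857 },
      { token_type: 'cache_read', token_count: 0 },
      { token_type: 'cache_write', token_count: 12539 },
      { token_type: 'output', token_count: 362 },
    ],
  },
};

// Prefer the breakdown and drop the lumped prompt_tokens.
const breakdown = sketchCopilotBreakdown(body);
const merged = { ...body.usage, ...(breakdown ?? {}) };
if (breakdown && breakdown.input_tokens !== undefined) {
  delete merged.prompt_tokens;
}
console.log(merged.input_tokens, merged.cache_creation_input_tokens);
```

A response without `copilot_usage` yields `null` from the sketch, so the flattened `usage` object passes through untouched, which matches the "unaffected" claim for plain OpenAI responses.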

Before → After (real run shape)

field	before	after
input_tokens	16396	3857
cache_write_tokens	0	12539
cache_read_tokens	0	0
output_tokens	362	362

Tests

16 new unit tests in token-tracker.parsing.test.js; full api-proxy suite green (1238 passed).

Scope note

Addresses the cache-split fidelity gap found while investigating the AIC=0 regression. prompt_tokens extraction itself provably works for this shape, so this is independent of the separate "no token-usage record at all" symptom (which still needs api-proxy container logs to pin down).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Claude-via-Copilot responses report a flattened usage object where prompt_tokens lumps fresh input together with cache-write tokens, and cache_creation_input_tokens is absent. The authoritative per-type split (input / cache_read / cache_write / output) lives only in the sibling copilot_usage.token_details array.

Parse copilot_usage.token_details (in both non-streaming JSON and the SSE final chunk) and prefer it over the lumped prompt_tokens so cache-write tokens are recorded and billed correctly instead of being mis-counted as plain input. Plain OpenAI responses without copilot_usage are unaffected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-06-18T15:54:21Z

✅ Coverage Check Passed

Overall Coverage

Metric	Base	PR	Delta
Lines	97.57%	97.61%	📈 +0.04%
Statements	97.50%	97.54%	📈 +0.04%
Functions	98.84%	98.84%	➡️ +0.00%
Branches	92.95%	92.98%	📈 +0.03%

📁 Per-file Coverage Changes (1 files)

File	Lines (Before → After)	Statements (Before → After)
`src/workdir-setup.ts`	92.7% → 94.5% (+1.82%)	92.7% → 94.5% (+1.82%)

Coverage comparison generated by scripts/ci/compare-coverage.ts

Copilot

Pull request overview

This PR fixes token accounting for Claude models served via the GitHub Copilot OpenAI-compatible /chat/completions endpoint by preferring the authoritative copilot_usage.token_details breakdown over the flattened usage.prompt_tokens, restoring correct cache-write vs fresh-input attribution in the api-proxy’s usage normalization.

Changes:

Add extractCopilotUsageBreakdown() to parse copilot_usage.token_details (top-level or response-nested) into normalized usage fields.
Integrate the Copilot breakdown into both non-streaming JSON parsing (extractUsageFromJson) and streaming final-chunk parsing (extractUsageFromSseLine), and drop lumped prompt_tokens when appropriate.
Add unit tests covering the breakdown extraction and integration paths.
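
For the streaming path, a minimal sketch of handling one SSE line might look like the block below. It assumes the OpenAI-style framing (`data: {...}` lines, a `data: [DONE]` sentinel, usage arriving on the final chunk). `sketchUsageFromSseLine` and `SSE_FIELD_BY_TYPE` are hypothetical names, the output field names are assumptions, and the real `extractUsageFromSseLine` may differ in structure and return shape.

```javascript
// Assumed field names for each Copilot token_type (not confirmed by the PR).
const SSE_FIELD_BY_TYPE = new Map([
  ['input', 'input_tokens'],
  ['output', 'output_tokens'],
  ['cache_read', 'cache_read_input_tokens'],
  ['cache_write', 'cache_creation_input_tokens'],
]);

// Sketch only: returns a usage object for a chunk that carries usage data,
// or null for comments, keep-alives, the [DONE] sentinel, and usage-free chunks.
function sketchUsageFromSseLine(line) {
  if (typeof line !== 'string' || !line.startsWith('data:')) return null;
  const payload = line.slice('data:'.length).trim();
  if (payload === '' || payload === '[DONE]') return null;

  let json;
  try {
    json = JSON.parse(payload);
  } catch {
    return null; // Partial or malformed chunk: ignore it.
  }
  if (json === null || typeof json !== 'object') return null;

  const details = json.copilot_usage?.token_details;
  const hasUsage = json.usage !== null && typeof json.usage === 'object';
  if (!hasUsage && !Array.isArray(details)) return null;

  const usage = hasUsage ? { ...json.usage } : {};
  if (Array.isArray(details)) {
    for (const entry of details) {
      const field = SSE_FIELD_BY_TYPE.get(entry?.token_type);
      if (field && Number.isFinite(entry.token_count)) {
        usage[field] = entry.token_count;
      }
    }
    // The breakdown supplies a precise input count, so drop the lumped value.
    if (usage.input_tokens !== undefined) delete usage.prompt_tokens;
  }
  return usage;
}

const finalChunk = 'data: ' + JSON.stringify({
  choices: [],
  usage: { prompt_tokens: 16396, completion_tokens: 362 },
  copilot_usage: {
    token_details: [
      { token_type: 'input', token_count: 3857 },
      { token_type: 'cache_write', token_count: 12539 },
    ],
  },
});
console.log(sketchUsageFromSseLine(finalChunk));
console.log(sketchUsageFromSseLine('data: [DONE]')); // null
```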

Show a summary per file

File	Description
containers/api-proxy/token-parsers.js	Adds Copilot-specific usage breakdown parsing and integrates it into JSON + SSE parsing paths.
containers/api-proxy/token-tracker.js	Re-exports the new breakdown helper via the token-tracker facade for tests/consumers.
containers/api-proxy/token-tracker.parsing.test.js	Adds unit tests validating Copilot breakdown extraction and end-to-end normalization behavior.

Copilot's findings

Files reviewed: 3/3 changed files
Comments generated: 3

+    const copilotBreakdown = extractCopilotUsageBreakdown(json);
+    if (copilotBreakdown) {
+      const merged = { ...(result.usage || {}), ...copilotBreakdown };
+      // Drop the lumped prompt_tokens so normalizeUsage uses the accurate
+      // input_tokens instead of input+cache_write.
+      if (copilotBreakdown.input_tokens !== undefined) {
+        delete merged.prompt_tokens;
+      }
+      result.usage = merged;
+    }


+      const copilotBreakdown = extractCopilotUsageBreakdown(json);
+      if (copilotBreakdown) {
+        result.usage = { ...result.usage, ...copilotBreakdown };
+        if (copilotBreakdown.input_tokens !== undefined) {
+          delete result.usage.prompt_tokens;
+        }
+      }


+  test('uses copilot_usage even when the flattened usage object is absent', () => {
+    const body = Buffer.from(JSON.stringify({
+      model: 'claude-sonnet-4.6',
+      copilot_usage: {
+        token_details: [
+          { token_type: 'input', token_count: 200 },
+          { token_type: 'output', token_count: 10 },
+          { token_type: 'cache_write', token_count: 99 },
+        ],
+      },
+    }));
+    expect(normalizeUsage(extractUsageFromJson(body).usage)).toEqual({
+      input_tokens: 200,
+      output_tokens: 10,
+      cache_read_tokens: 0,
+      cache_write_tokens: 99,
+      reasoning_tokens: 0,
+    });
+  });
+});


lpcox · 2026-06-18T15:57:20Z

@copilot address review feedback

… but no input

When copilot_usage.token_details provides cache_write but omits input, prompt_tokens (= input + cache_write) would be kept alongside cache_creation_input_tokens, causing normalizeUsage to double-count cache_write.

Fix: infer input_tokens = max(0, prompt_tokens - cache_creation_input_tokens) and drop prompt_tokens in both the non-streaming and streaming paths. Add regression tests for this edge case in both paths.

Copilot · 2026-06-18T16:01:43Z

@copilot address review feedback

Addressed in commit fix(api-proxy): infer input_tokens when copilot_usage has cache_write but no input.

Both parsing paths now handle the edge case: when copilot_usage.token_details provides cache_write but omits input, the code infers input_tokens = max(0, prompt_tokens - cache_creation_input_tokens) and drops prompt_tokens, preventing the double-count in normalizeUsage. Two regression tests lock in the behavior (non-streaming + streaming).
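
The inference rule described here can be illustrated with a small sketch. `inferInputTokens` is a hypothetical helper name, not a function from the PR; it operates on an already-merged usage object and uses the field names quoted in this thread (`prompt_tokens`, `input_tokens`, `cache_creation_input_tokens`).

```javascript
// Sketch only: resolve the double-count when the Copilot breakdown reports
// cache_write (cache_creation_input_tokens) but no input count.
function inferInputTokens(usage) {
  const out = { ...usage };
  const cacheWrite = out.cache_creation_input_tokens;

  if (out.input_tokens !== undefined) {
    // An explicit input count is authoritative; the lumped value is redundant.
    delete out.prompt_tokens;
  } else if (Number.isFinite(cacheWrite) && Number.isFinite(out.prompt_tokens)) {
    // prompt_tokens = input + cache_write, so subtract and clamp at zero.
    out.input_tokens = Math.max(0, out.prompt_tokens - cacheWrite);
    delete out.prompt_tokens;
  }
  return out;
}

// Numbers from the PR description: 16396 - 12539 = 3857.
console.log(inferInputTokens({
  prompt_tokens: 16396,
  completion_tokens: 362,
  cache_creation_input_tokens: 12539,
}).input_tokens); // 3857

// Clamp case: an inconsistent upstream payload never yields a negative count.
console.log(inferInputTokens({
  prompt_tokens: 100,
  cache_creation_input_tokens: 150,
}).input_tokens); // 0
```

Clamping with `Math.max(0, ...)` trades exactness for safety: if upstream ever reports a cache-write count larger than `prompt_tokens`, the record undercounts input rather than storing a negative value.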

github-actions · 2026-06-18T16:06:14Z

✅ Copilot review passed with no inline comments.

@lpcox Add the ready-for-aw label to this PR to trigger agentic CI smoke tests.

github-actions · 2026-06-18T16:06:58Z

❌ Smoke Claude failed

No user request was provided in this turn — only system reminders and project context. No action taken.

github-actions · 2026-06-18T16:07:06Z

✅ Contribution Check completed successfully!

github-actions · 2026-06-18T16:08:19Z

📡 Smoke OTel Tracing completed. All tracing scenarios validated. ✅

github-actions · 2026-06-18T16:08:36Z

✅ Smoke Copilot BYOK AOAI (Entra) completed. Copilot AOAI BYOK (Entra) mode operational. 🔓

github-actions · 2026-06-18T16:08:38Z

📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤

github-actions · 2026-06-18T16:08:42Z

🔌 Smoke Services — Service connectivity failed ⚠️

github-actions · 2026-06-18T16:08:50Z

❌ Smoke Copilot BYOK AOAI (api-key) reports failed. AOAI BYOK (api-key) mode investigation needed...

github-actions · 2026-06-18T16:08:54Z

Chroot tests failed. Smoke Chroot failed; see logs for details.

github-actions · 2026-06-18T16:08:59Z

✅ Smoke Gemini completed. All facets verified. 💎

Smoke test completed with FAIL status. Connectivity and MCP tools were unavailable. File operations passed.

github-actions · 2026-06-18T16:09:00Z

✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟

github-actions · 2026-06-18T16:09:07Z

✅ Build Test Suite completed successfully!

github-actions · 2026-06-18T16:09:15Z

🔑 Smoke Copilot PAT: PAT auth validated. All systems operational. ✅

github-actions · 2026-06-18T16:09:21Z

❌ Smoke Copilot BYOK reports failed. BYOK mode investigation needed...

github-actions · 2026-06-18T16:15:27Z

Add comprehensive gVisor firewall comparison workflow: ✅
refactor: extract provider env var constants to a shared module: ✅
GitHub title check: ✅
Smoke-test file write/read: ✅
npm ci && npm run build: ✅
Overall: PASS

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

registry.npmjs.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex

github-actions · 2026-06-18T16:15:37Z

🔥 Smoke Test: Copilot PAT Auth — FAIL

Test	Result
GitHub MCP connectivity	✅ Connected (PR list retrieved)
GitHub.com HTTP	✅ HTTP 200
File write/read	❌ Pre-step data unavailable (template vars unsubstituted)

Overall: FAIL — pre-step outputs not available; file test could not be verified.

Auth mode: PAT (COPILOT_GITHUB_TOKEN) | PR author: @lpcox

🔑 PAT report filed by Smoke Copilot PAT

github-actions · 2026-06-18T16:17:09Z

🔬 Smoke Test Results

PR: fix(api-proxy): use copilot_usage token_details for accurate cache split
Author: @lpcox

Test	Result
GitHub MCP connectivity	✅ PASS
GitHub.com HTTP	⚠️ N/A (pre-step outputs not injected)
File write/read	⚠️ N/A (pre-step outputs not injected)

Overall: PARTIAL — MCP confirmed working; pre-computed step outputs (SMOKE_HTTP_CODE, SMOKE_FILE_CONTENT, SMOKE_FILE_PATH) were not substituted (workflow template issue).

📰 BREAKING: Report filed by Smoke Copilot

github-actions · 2026-06-18T16:17:09Z

@lpcox Smoke Test Results:
GitHub MCP Testing: ✅
GitHub.com Connectivity: ✅
File Write/Read Test: ✅
BYOK Inference Test: ✅
Running in direct BYOK mode (github-oidc + AzureEntra + COPILOT_PROVIDER_BASE_URL) via api-proxy → Foundry (o4-mini-aw).
Overall Status: PASS

🪪 BYOK (AOAI Entra) report filed by Smoke Copilot BYOK AOAI (Entra)

github-actions · 2026-06-18T16:18:07Z

🔍 Smoke Test: API Proxy OpenTelemetry Tracing

Scenario	Result	Notes
1. Module Loading	✅	`otel.js` loads; exports `startRequestSpan`, `setTokenAttributes`, `setBudgetAttributes`, `endSpan`, `endSpanError`, `shutdown`, `isEnabled` + 7 test helpers; `isEnabled()` = `true`
2. Test Suite	✅	59 passed, 0 failed — 2 suites (`otel.test.js`, `otel-fanout.test.js`)
3. Env Var Forwarding	✅	`api-proxy-service-config.ts` forwards `GH_AW_OTLP_ENDPOINTS`, `OTEL_EXPORTER_OTLP_ENDPOINT`, `OTEL_EXPORTER_OTLP_HEADERS`, `GITHUB_AW_OTEL_TRACE_ID`, `GITHUB_AW_OTEL_PARENT_SPAN_ID`, `OTEL_SERVICE_NAME` to the api-proxy container
4. Token Tracker Integration	✅	`onUsage` callback present at line 283 of `token-tracker-http.js` with JSDoc at line 374
5. OTEL Diagnostics	✅	`FileSpanExporter` fallback active (no OTLP endpoint configured → writes to `/var/log/api-proxy/otel.jsonl`); no external OTLP export expected in this run

All scenarios pass. OTEL tracing integration is healthy on this PR.

📡 OTel tracing validated by Smoke OTel Tracing

github-actions · 2026-06-18T16:20:18Z

Smoke Test Results

GitHub MCP Testing: ❌ (Tools not found)
GitHub.com Connectivity: ❌ (SSL connect error 35)
File Writing Testing: ✅
Bash Tool Testing: ✅

Overall status: FAIL

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

localhost

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "localhost"

See Network Configuration for more information.

💎 Faceted by Smoke Gemini

github-actions · 2026-06-18T16:32:49Z

🏗️ Build Test Suite Results

Ecosystem	Project	Build/Install	Tests	Status
Bun	elysia	✅	1/1 passed	✅ PASS
Bun	hono	✅	1/1 passed	✅ PASS
C++	fmt	✅	N/A	✅ PASS
C++	json	✅	N/A	✅ PASS
Deno	oak	N/A	1/1 passed	✅ PASS
Deno	std	N/A	1/1 passed	✅ PASS
.NET	hello-world	✅	N/A	✅ PASS
.NET	json-parse	✅	N/A	✅ PASS
Go	color	✅	1/1 passed	✅ PASS
Go	env	✅	1/1 passed	✅ PASS
Go	uuid	✅	1/1 passed	✅ PASS
Java	gson	✅	1/1 passed	✅ PASS
Java	caffeine	✅	1/1 passed	✅ PASS
Node.js	clsx	✅	passed	✅ PASS
Node.js	execa	✅	passed	✅ PASS
Node.js	p-limit	✅	passed	✅ PASS
Rust	fd	✅	1/1 passed	✅ PASS
Rust	zoxide	✅	1/1 passed	✅ PASS

Overall: 8/8 ecosystems passed — ✅ PASS

Generated by Build Test Suite for issue #5253

Copilot AI review requested due to automatic review settings June 18, 2026 15:52

Copilot started reviewing on behalf of lpcox June 18, 2026 15:53

Copilot AI reviewed Jun 18, 2026

Copilot started work on behalf of lpcox June 18, 2026 15:57

Copilot finished work on behalf of lpcox June 18, 2026 16:02

lpcox mentioned this pull request Jun 18, 2026

fix(api-proxy): copy token-tracker-shared + otel modules into image (fixes AIC=0) #5254

Merged

lpcox added the ready-for-aw label Jun 18, 2026

lpcox temporarily deployed to aoai-model June 18, 2026 16:08 — with GitHub Actions Inactive

github-actions Bot added the smoke-codex label Jun 18, 2026

lpcox temporarily deployed to aoai-model June 18, 2026 16:15 — with GitHub Actions Inactive

github-actions Bot added the smoke-copilot-byok-aoai-entra label Jun 18, 2026

lpcox temporarily deployed to aoai-model June 18, 2026 16:17 — with GitHub Actions Inactive

github-actions Bot mentioned this pull request Jun 18, 2026

[aw] Smoke Chroot failed #5257

Open

This was referenced Jun 18, 2026

[aw] Smoke Copilot BYOK failed #5259

Open

[aw] Smoke Copilot BYOK AOAI (api-key) failed #5260

Open

github-actions Bot added the build-test label Jun 18, 2026

lpcox merged commit 41cf5ac into main Jun 18, 2026
90 of 105 checks passed

lpcox deleted the fix/copilot-usage-token-details branch June 18, 2026 18:26
