You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The agent-side AI-credits (AIC) calculator in actions/setup/js/model_costs.cjsunder-reports AI credits for Anthropic models. It recomputes AIC from raw token counts and incorrectly subtracts cache_read_tokens from Anthropic's input_tokens. Anthropic already reports input_tokensexclusive of cached tokens (cache_read_input_tokens / cache_creation_input_tokens are reported separately and are additive), so the subtraction zeroes out the genuinely-fresh input credit.
This produces a persistent drift between:
agent_usage.json / GH_AW_AIC (computed by the harness), and
the firewall api-proxy's authoritative ai_credits_total in token-usage.jsonl.
The verify_token_usage smoke job surfaces this as a non-blocking warning:
ai_credits drift: last ai_credits_total is 45.971025, agent_usage reports 44.719.
Root cause
actions/setup/js/model_costs.cjs
// line ~203, computeInferenceCostUSD()consteffectiveInput=providerIncludesCacheReadsInInput(provider)
? Math.max(input-cacheRead,0)
: input;
// line ~98functionproviderIncludesCacheReadsInInput(provider){switch(normalizeProvider(provider)){case"":
case"anthropic": // <-- incorrect for Anthropiccase"openai":
case"azure-openai":
case"azure_openai":
returntrue;default:
returnfalse;}}
Provider token semantics differ:
Provider
input_tokens semantics
Subtract cache from input?
OpenAI / Azure OpenAI
Total input; cached tokens are a subset
✅ Yes (avoid double-count)
Anthropic
Non-cached input only; cache_read + cache_creation are separate & additive
❌ No
Returning true for anthropic (and the empty/default "" case, which assumes OpenAI semantics) makes the harness subtract cache_read from an input_tokens value that never included it, so for cache-heavy Anthropic requests effectiveInput collapses to 0 and the fresh-input credit is dropped.
This is the mirror image of a fix just landed in the firewall api-proxy (githubnext/gh-aw-firewall PR #5271), where calculateAiCredits was made provider-aware so it does not subtract cache for Anthropic. The proxy is now correct; the harness recompute is not, hence the drift.
opus-4-7 pricing ($/M): input 5.00, cache_read 0.50, cache_write 6.25, output 25.00. Each per-response value above checks out exactly, and the cumulative proxy total is 45.971025.
agent_usage.json from the same run agrees on raw tokens (input 2505, output 384, cache_read 120183, cache_write 60399) but reports ai_credits: 44.719.
Reconciling the difference with the buggy formula (Anthropic, subtract cache_read → effectiveInput = max(2505 − 120183, 0) = 0):
The exact gap (45.971025 − 44.719 ≈ 1.2525) equals the dropped fresh-input credit (2505 × 5.00 / 10000). This confirms the bug is solely the Anthropic cache-read subtraction.
Impact
GH_AW_AIC, agent_usage.json.ai_credits, the step-summary token table, and any OTEL ai_credits derived from the harness under-report AIC for Anthropic models whenever cache reads are present (i.e., almost every multi-turn Claude run).
Causes a recurring, confusing verify_token_usage drift warning even when the proxy accounting is correct.
AIC-budget reasoning that relies on the harness number (rather than the proxy's) is skewed low for Anthropic.
Suggested fixes
Option A (minimal/targeted): In providerIncludesCacheReadsInInput, return false for anthropic so Anthropic input is not cache-subtracted. (Reconsider the empty/default "" case too — it currently assumes OpenAI semantics.)
Option B (preferred, more robust): When the proxy token-usage.jsonl already carries authoritative ai_credits_this_response / ai_credits_total, trust those values instead of recomputing AIC from raw tokens in the harness. The api-proxy computes AIC at request time with full provider-aware pricing and is the source of truth; recomputing downstream invites exactly this kind of divergence.
Either fix eliminates the drift. Option B additionally future-proofs against provider/pricing logic drifting between the proxy and the harness.
Summary
The agent-side AI-credits (AIC) calculator in
actions/setup/js/model_costs.cjsunder-reports AI credits for Anthropic models. It recomputes AIC from raw token counts and incorrectly subtractscache_read_tokensfrom Anthropic'sinput_tokens. Anthropic already reportsinput_tokensexclusive of cached tokens (cache_read_input_tokens/cache_creation_input_tokensare reported separately and are additive), so the subtraction zeroes out the genuinely-fresh input credit.This produces a persistent drift between:
agent_usage.json/GH_AW_AIC(computed by the harness), andai_credits_totalintoken-usage.jsonl.The
verify_token_usagesmoke job surfaces this as a non-blocking warning:Root cause
actions/setup/js/model_costs.cjsProvider token semantics differ:
input_tokenssemanticscache_read+cache_creationare separate & additiveReturning
trueforanthropic(and the empty/default""case, which assumes OpenAI semantics) makes the harness subtractcache_readfrom aninput_tokensvalue that never included it, so for cache-heavy Anthropic requestseffectiveInputcollapses to0and the fresh-input credit is dropped.This is the mirror image of a fix just landed in the firewall api-proxy (githubnext/gh-aw-firewall PR #5271), where
calculateAiCreditswas made provider-aware so it does not subtract cache for Anthropic. The proxy is now correct; the harness recompute is not, hence the drift.Reproduction & evidence
Run: Smoke Claude — https://github.com/github/gh-aw-firewall/actions/runs/27803071455 (engine: claude, model resolved to
claude-opus-4-7, gh-aw harnessv0.79.6)Proxy
token-usage.jsonl(4 successful Anthropic requests, authoritative per-response AIC):ai_credits_this_responseopus-4-7 pricing ($/M): input 5.00, cache_read 0.50, cache_write 6.25, output 25.00. Each per-response value above checks out exactly, and the cumulative proxy total is 45.971025.
agent_usage.jsonfrom the same run agrees on raw tokens (input 2505, output 384, cache_read 120183, cache_write 60399) but reportsai_credits: 44.719.Reconciling the difference with the buggy formula (Anthropic, subtract cache_read →
effectiveInput = max(2505 − 120183, 0) = 0):The exact gap (
45.971025 − 44.719 ≈ 1.2525) equals the dropped fresh-input credit (2505 × 5.00 / 10000). This confirms the bug is solely the Anthropic cache-read subtraction.Impact
GH_AW_AIC,agent_usage.json.ai_credits, the step-summary token table, and any OTELai_creditsderived from the harness under-report AIC for Anthropic models whenever cache reads are present (i.e., almost every multi-turn Claude run).verify_token_usagedrift warning even when the proxy accounting is correct.Suggested fixes
Option A (minimal/targeted): In
providerIncludesCacheReadsInInput, returnfalseforanthropicso Anthropic input is not cache-subtracted. (Reconsider the empty/default""case too — it currently assumes OpenAI semantics.)Option B (preferred, more robust): When the proxy
token-usage.jsonlalready carries authoritativeai_credits_this_response/ai_credits_total, trust those values instead of recomputing AIC from raw tokens in the harness. The api-proxy computes AIC at request time with full provider-aware pricing and is the source of truth; recomputing downstream invites exactly this kind of divergence.Either fix eliminates the drift. Option B additionally future-proofs against provider/pricing logic drifting between the proxy and the harness.
References
actions/setup/js/model_costs.cjs(computeInferenceCostUSD~L203,providerIncludesCacheReadsInInput~L98)actions/setup/js/parse_mcp_gateway_log.cjs(computeInferenceAIC, L128–146),actions/setup/js/parse_token_usage.cjs(writesagent_usage.json)calculateAiCredits)