Agent-side AI-credits calculator under-reports AIC for Anthropic (subtracts cache_read from input_tokens)

## Summary

The agent-side AI-credits (AIC) calculator in `actions/setup/js/model_costs.cjs` **under-reports AI credits for Anthropic models**. It recomputes AIC from raw token counts and incorrectly subtracts `cache_read_tokens` from Anthropic's `input_tokens`. Anthropic already reports `input_tokens` *exclusive* of cached tokens (`cache_read_input_tokens` / `cache_creation_input_tokens` are reported **separately and are additive**), so the subtraction discards credit for genuinely fresh input, and zeroes it out entirely whenever cache reads meet or exceed `input_tokens`.

This produces a persistent drift between:
- `agent_usage.json` / `GH_AW_AIC` (computed by the harness), and
- the firewall api-proxy's authoritative `ai_credits_total` in `token-usage.jsonl`.

The `verify_token_usage` smoke job surfaces this as a non-blocking warning:

> `ai_credits drift: last ai_credits_total is 45.971025, agent_usage reports 44.719.`

## Root cause

`actions/setup/js/model_costs.cjs`

```js
// line ~203, computeInferenceCostUSD()
const effectiveInput = providerIncludesCacheReadsInInput(provider)
  ? Math.max(input - cacheRead, 0)
  : input;
```

```js
// line ~98
function providerIncludesCacheReadsInInput(provider) {
  switch (normalizeProvider(provider)) {
    case "":
    case "anthropic":      // <-- incorrect for Anthropic
    case "openai":
    case "azure-openai":
    case "azure_openai":
      return true;
    default:
      return false;
  }
}
```

**Provider token semantics differ:**

| Provider | `input_tokens` semantics | Subtract cache from input? |
|----------|--------------------------|----------------------------|
| OpenAI / Azure OpenAI | **Total** input; cached tokens are a **subset** | ✅ Yes (avoid double-count) |
| **Anthropic** | **Non-cached input only**; `cache_read` + `cache_creation` are **separate & additive** | ❌ **No** |

Returning `true` for `anthropic` (and the empty/default `""` case, which assumes OpenAI semantics) makes the harness subtract `cache_read` from an `input_tokens` value that never included it, so for cache-heavy Anthropic requests `effectiveInput` collapses to `0` and the fresh-input credit is dropped.

This is the harness-side counterpart of a fix that recently landed in the firewall api-proxy (githubnext/gh-aw-firewall PR #5271), where `calculateAiCredits` was made provider-aware so it does **not** subtract cache for Anthropic. The proxy is now correct; the harness recompute is not, hence the drift.

## Reproduction & evidence

**Run:** Smoke Claude — https://github.com/github/gh-aw-firewall/actions/runs/27803071455 (engine: claude, model resolved to `claude-opus-4-7`, gh-aw harness `v0.79.6`)

Proxy `token-usage.jsonl` (4 successful Anthropic requests, authoritative per-response AIC):

| # | input | output | cache_read | cache_write | `ai_credits_this_response` |
|---|------:|-------:|-----------:|------------:|---------------------------:|
| 1 | 2498 | 22 | 0 | 0 | 1.304 |
| 2 | 5 | 145 | 0 | 59981 | 37.853125 |
| 3 | 1 | 82 | 59981 | 221 | 3.342675 |
| 4 | 1 | 135 | 60202 | 197 | 3.471225 |
| **Σ** | **2505** | **384** | **120183** | **60399** | **45.971025** |

opus-4-7 pricing ($/M): input 5.00, cache_read 0.50, cache_write 6.25, output 25.00. Each per-response value above checks out exactly, and the cumulative proxy total is **45.971025**.

`agent_usage.json` from the same run agrees on **raw tokens** (input 2505, output 384, cache_read 120183, cache_write 60399) but reports `ai_credits: 44.719`.

Reconciling the difference with the buggy formula (Anthropic, subtract cache_read → `effectiveInput = max(2505 − 120183, 0) = 0`):

```
input:       0      ×  5.00 / 10000 =  0
cache_read:  120183 ×  0.50 / 10000 =  6.00915
cache_write: 60399  ×  6.25 / 10000 = 37.749375
output:      384    × 25.00 / 10000 =  0.96
total                               = 44.718525  ≈ 44.719   ✅ matches agent_usage
```

The gap equals the dropped fresh-input credit: `45.971025 − 44.718525 = 1.2525 = 2505 × 5.00 / 10000` (the reported `44.719` is the same harness total rounded to three decimals). This confirms the bug is solely the Anthropic cache-read subtraction.
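The reconciliation can be replayed with a short script. This is a minimal sketch, not the harness code: `PRICING`, `credits`, and `totalAic` are hypothetical names introduced here, and the `/ 10000` divisor is taken from the arithmetic above rather than from `model_costs.cjs`.

```javascript
// Pricing in $ per million tokens, as quoted above for opus-4-7.
const PRICING = { input: 5.0, cacheRead: 0.5, cacheWrite: 6.25, output: 25.0 };

// Hypothetical helper: tokens x ($/M) / 10000, the same conversion
// used in the reconciliation table above.
function credits(tokens, pricePerMillion) {
  return (tokens * pricePerMillion) / 10000;
}

// Hypothetical helper: total AIC with or without the cache-read subtraction.
function totalAic(usage, subtractCacheRead) {
  const effectiveInput = subtractCacheRead
    ? Math.max(usage.input - usage.cacheRead, 0) // current harness behaviour
    : usage.input; // Anthropic semantics: input already excludes cache reads
  return (
    credits(effectiveInput, PRICING.input) +
    credits(usage.cacheRead, PRICING.cacheRead) +
    credits(usage.cacheWrite, PRICING.cacheWrite) +
    credits(usage.output, PRICING.output)
  );
}

// Cumulative raw tokens from the run above.
const run = { input: 2505, output: 384, cacheRead: 120183, cacheWrite: 60399 };
const buggy = totalAic(run, true);
const fixed = totalAic(run, false);
console.log(buggy.toFixed(6), fixed.toFixed(6), (fixed - buggy).toFixed(4));
```

With the subtraction enabled the script should land on the harness figure (44.718525), and with it disabled on the proxy total (45.971025).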

## Impact

- `GH_AW_AIC`, `agent_usage.json.ai_credits`, the step-summary token table, and any OTEL `ai_credits` derived from the harness **under-report** AIC for Anthropic models whenever cache reads are present (i.e., almost every multi-turn Claude run).
- Causes a recurring, confusing `verify_token_usage` drift warning even when the proxy accounting is correct.
- AIC-budget reasoning that relies on the harness number (rather than the proxy's) is skewed low for Anthropic.

## Suggested fixes

**Option A (minimal/targeted):** In `providerIncludesCacheReadsInInput`, return `false` for `anthropic` so Anthropic input is not cache-subtracted. (Reconsider the empty/default `""` case too — it currently assumes OpenAI semantics.)
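A minimal sketch of what Option A might look like. The `normalizeProvider` below is a stand-in so the snippet runs on its own (the real one in `model_costs.cjs` may behave differently), and `effectiveInputTokens` is a hypothetical wrapper around the existing `effectiveInput` expression.

```javascript
// Stand-in for the module's normalizeProvider (assumption: trims and lowercases).
function normalizeProvider(provider) {
  return String(provider ?? "").trim().toLowerCase();
}

function providerIncludesCacheReadsInInput(provider) {
  switch (normalizeProvider(provider)) {
    case "": // left as-is in this sketch; still assumes OpenAI semantics
    case "openai":
    case "azure-openai":
    case "azure_openai":
      return true; // input_tokens is a total; cached tokens are a subset
    case "anthropic":
      return false; // input_tokens already excludes cache reads
    default:
      return false;
  }
}

// Hypothetical wrapper mirroring the effectiveInput expression shown earlier.
function effectiveInputTokens(provider, input, cacheRead) {
  return providerIncludesCacheReadsInInput(provider)
    ? Math.max(input - cacheRead, 0)
    : input;
}

console.log(effectiveInputTokens("anthropic", 2505, 120183)); // fresh input kept
console.log(effectiveInputTokens("openai", 1000, 400)); // cached subset removed
```

The explicit `anthropic` case is redundant with `default`, but it documents the decision and guards against the provider being added back to the `true` group by accident.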

**Option B (preferred, more robust):** When the proxy `token-usage.jsonl` already carries authoritative `ai_credits_this_response` / `ai_credits_total`, **trust those values** instead of recomputing AIC from raw tokens in the harness. The api-proxy computes AIC at request time with full provider-aware pricing and is the source of truth; recomputing downstream invites exactly this kind of divergence.
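One way Option B could be wired up, as a sketch under stated assumptions: each line of `token-usage.jsonl` is taken to be a JSON object that may carry a numeric `ai_credits_total`, and `lastProxyAicTotal` / `resolveAic` are hypothetical names, not existing harness functions. The record shape in `sample` is illustrative only.

```javascript
// Hypothetical helper: returns the last cumulative ai_credits_total found in
// the JSONL text, or null when no record carries a usable value.
function lastProxyAicTotal(jsonlText) {
  let total = null;
  for (const line of jsonlText.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    let record;
    try {
      record = JSON.parse(trimmed);
    } catch {
      continue; // skip malformed lines rather than failing the whole parse
    }
    if (
      record !== null &&
      typeof record === "object" &&
      Number.isFinite(record.ai_credits_total)
    ) {
      total = record.ai_credits_total;
    }
  }
  return total;
}

// Prefer the proxy's figure; fall back to the harness recompute only when
// the proxy log has nothing to offer.
function resolveAic(jsonlText, recompute) {
  const proxyTotal = lastProxyAicTotal(jsonlText);
  return proxyTotal !== null ? proxyTotal : recompute();
}

// Illustrative records using figures from the run above.
const sample = [
  '{"ai_credits_this_response": 1.304, "ai_credits_total": 1.304}',
  '{"ai_credits_this_response": 3.471225, "ai_credits_total": 45.971025}',
].join("\n");

console.log(resolveAic(sample, () => 44.719)); // proxy value is used
console.log(resolveAic("", () => 44.719)); // falls back to the recompute
```

Keeping the recompute as a fallback means runs without a proxy log still produce a number, while runs with one no longer depend on the harness matching the proxy's pricing logic.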

Either fix eliminates the drift. Option B additionally future-proofs against provider/pricing logic drifting between the proxy and the harness.

## References
- Harness AIC formula: `actions/setup/js/model_costs.cjs` (`computeInferenceCostUSD` ~L203, `providerIncludesCacheReadsInInput` ~L98)
- Harness recompute call sites: `actions/setup/js/parse_mcp_gateway_log.cjs` (`computeInferenceAIC`, L128–146), `actions/setup/js/parse_token_usage.cjs` (writes `agent_usage.json`)
- Corresponding proxy-side fix: githubnext/gh-aw-firewall PR #5271 (provider-aware `calculateAiCredits`)

