diff --git a/.changeset/openrouter-cost-tracking.md b/.changeset/openrouter-cost-tracking.md new file mode 100644 index 000000000..6ab751711 --- /dev/null +++ b/.changeset/openrouter-cost-tracking.md @@ -0,0 +1,27 @@ +--- +'@tanstack/ai-openrouter': minor +'@tanstack/ai': minor +--- + +Surface OpenRouter's per-request cost on `RUN_FINISHED.usage`. + +OpenRouter reports the actual cost of each request inline on the chat response. +The `openRouterText` and `openRouterResponsesText` adapters now forward that +value on the terminal `RUN_FINISHED` event as `usage.cost`, with OpenRouter's +per-request breakdown under `usage.costDetails`. This is the cost OpenRouter +itself reports — it is not computed locally from token counts, so it accounts +for routing, fallback providers, BYOK, and cached-token pricing. + +`@tanstack/ai` adds a shared `UsageTotals` type with optional `cost` and +`costDetails` fields, plus a provider-neutral `UsageCostBreakdown` interface +with three canonical fields (`upstreamCost`, `upstreamInputCost`, +`upstreamOutputCost`). Each adapter's extractor normalizes its provider's +wire-shape onto this canonical form, so consumer code reads the same fields +regardless of which gateway populated them — swapping adapters is a one-line +change with no consumer rewrites. The OpenRouter adapter collapses its two +endpoint naming styles (Chat Completions' `prompt`/`completions` and +Responses' `input`/`output`) onto the same canonical input/output split, since +they bill against the same tokens. `RunFinishedEvent.usage`, the middleware +`UsageInfo` (`onUsage`), and `FinishInfo.usage` (`onFinish`) all use +`UsageTotals`. The fields are optional and additive — adapters that do not +report cost are unaffected. 
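Reviewer note (illustrative only, not part of the patch): the changeset says `cost` and `costDetails` are optional and additive, so consumer code has to handle both "no usage at all" and "usage without cost". The sketch below shows one way that could look. It redeclares the two interfaces locally so it runs on its own; `CostSummary` and `summarizeUsage` are hypothetical names introduced for this example and do not exist in `@tanstack/ai`. The numbers are placeholders, and the unit of `cost` is whatever OpenRouter's Usage Accounting docs define.

```typescript
// Local mirrors of the `UsageCostBreakdown` and `UsageTotals` interfaces this
// diff adds to `@tanstack/ai`, redeclared so the sketch runs on its own.
interface UsageCostBreakdown {
  upstreamCost?: number
  upstreamInputCost?: number
  upstreamOutputCost?: number
}

interface UsageTotals {
  promptTokens: number
  completionTokens: number
  totalTokens: number
  cost?: number
  costDetails?: UsageCostBreakdown
}

// Hypothetical summary shape; not part of the library.
interface CostSummary {
  totalTokens: number
  totalCost: number
  runsWithCost: number
  runsWithoutCost: number
}

// Hypothetical helper: fold the `usage` of several RUN_FINISHED events into
// one summary. `usage` itself may be undefined, and `cost` is optional.
function summarizeUsage(
  runs: ReadonlyArray<UsageTotals | undefined>,
): CostSummary {
  const summary: CostSummary = {
    totalTokens: 0,
    totalCost: 0,
    runsWithCost: 0,
    runsWithoutCost: 0,
  }
  for (const usage of runs) {
    if (usage === undefined) {
      summary.runsWithoutCost += 1
      continue
    }
    summary.totalTokens += usage.totalTokens
    // Compare against undefined rather than truthiness: a reported cost of 0
    // is a real value and must still count as "cost present".
    if (usage.cost === undefined) {
      summary.runsWithoutCost += 1
    } else {
      summary.totalCost += usage.cost
      summary.runsWithCost += 1
    }
  }
  return summary
}

// Illustrative values only; they are not real OpenRouter prices.
const example = summarizeUsage([
  { promptTokens: 5, completionTokens: 2, totalTokens: 7, cost: 0.5 },
  { promptTokens: 4, completionTokens: 1, totalTokens: 5, cost: 0 },
  { promptTokens: 3, completionTokens: 1, totalTokens: 4 },
])
console.log(example)
```

The explicit `=== undefined` check mirrors the extractor's own contract in this patch, which preserves `cost === 0` instead of treating it as absent.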
diff --git a/README.md b/README.md index 1c98040b8..7b2ac19aa 100644 --- a/README.md +++ b/README.md @@ -186,7 +186,7 @@ Official adapters include: | Package | Use it for | | ------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------ | -| [`@tanstack/ai-openrouter`](https://tanstack.com/ai/latest/docs/adapters/openrouter) | 300+ models through one OpenRouter API | +| [`@tanstack/ai-openrouter`](https://tanstack.com/ai/latest/docs/adapters/openrouter) | 300+ models through one OpenRouter API, with per-request cost tracking | | [`@tanstack/ai-openai`](https://tanstack.com/ai/latest/docs/adapters/openai) | OpenAI chat, image, video, speech, transcription, realtime, and provider tools | | [`@tanstack/ai-anthropic`](https://tanstack.com/ai/latest/docs/adapters/anthropic) | Anthropic Claude chat, thinking, tools, and structured outputs | | [`@tanstack/ai-gemini`](https://tanstack.com/ai/latest/docs/adapters/gemini) | Google Gemini chat, image, speech, and audio generation | diff --git a/docs/adapters/openrouter.md b/docs/adapters/openrouter.md index 856ac1065..27a9cce09 100644 --- a/docs/adapters/openrouter.md +++ b/docs/adapters/openrouter.md @@ -169,6 +169,38 @@ Caveats while the Responses adapter is in beta: - If in doubt, prefer `openRouterText`. The Chat Completions endpoint has broader provider coverage and feature parity today. +## Cost Tracking + +OpenRouter reports the actual cost of each request inline on the streamed +response. When present, the adapter forwards it on the terminal `RUN_FINISHED` +event under `usage.cost`, with OpenRouter's per-request breakdown under +`usage.costDetails`. This is the cost OpenRouter itself reports for the +request — it is **not** computed locally from token counts, so it already +accounts for routing, fallback providers, BYOK, and cached-token pricing. 
See +OpenRouter's [Usage Accounting](https://openrouter.ai/docs/use-cases/usage-accounting) +docs for the meaning and units of these fields. + +```typescript +import { chat } from "@tanstack/ai"; +import { openRouterText } from "@tanstack/ai-openrouter"; + +for await (const chunk of chat({ + adapter: openRouterText("openai/gpt-5"), + messages: [{ role: "user", content: "Hello!" }], +})) { + if (chunk.type === "RUN_FINISHED") { + console.log("cost:", chunk.usage?.cost); + console.log("breakdown:", chunk.usage?.costDetails); + } +} +``` + +The same `usage` (including `cost` / `costDetails`) is passed to middleware via +the `onUsage` and `onFinish` hooks. When OpenRouter does not report a cost, the +fields are simply absent and the stream completes normally. Both +`openRouterText` and `openRouterResponsesText` populate cost when OpenRouter +returns it. + ## Next Steps - [Getting Started](../getting-started/quick-start) - Learn the basics diff --git a/packages/ai-openrouter/src/adapters/cost.ts b/packages/ai-openrouter/src/adapters/cost.ts new file mode 100644 index 000000000..dcd264d03 --- /dev/null +++ b/packages/ai-openrouter/src/adapters/cost.ts @@ -0,0 +1,101 @@ +/** + * Helpers for extracting OpenRouter's provider-reported per-request cost from the + * SDK usage object and shaping it for `RUN_FINISHED.usage`. + * + * OpenRouter returns an authoritative per-request `cost` plus an optional + * `cost_details` breakdown. We forward `cost` verbatim and normalize the + * breakdown onto `@tanstack/ai`'s canonical `UsageCostBreakdown` shape — so + * consumer code reads the same three fields regardless of which adapter (or + * which OpenRouter endpoint) produced them. OpenRouter exposes the breakdown + * under two naming families (Chat Completions: `prompt`/`completions`, + * Responses: `input`/`output`); both map onto the same canonical input/output + * split, because they bill against the same tokens. 
+ * + * Input is intentionally typed `unknown`: callers pass usage objects whose static + * types are narrowed to token-only fields (notably the Responses adapter), and the + * Responses usage normalizer can leave `cost_details` in snake_case. Reading both + * `costDetails` and `cost_details` and narrowing here keeps every call site simple. + */ + +import type { UsageCostBreakdown } from '@tanstack/ai' + +export interface ExtractedCost { + cost?: number + costDetails?: UsageCostBreakdown +} + +/** + * Wire-key → canonical-key mapping. Snake_case keys come from the raw/UNKNOWN + * `response.completed` fallback in the Responses adapter; camelCase keys come + * from the SDK-parsed path. Both Chat Completions' prompt/completions naming + * and Responses' input/output naming collapse onto `upstreamInputCost` / + * `upstreamOutputCost`. + */ +const KNOWN_DETAIL_KEYS: Record<string, keyof UsageCostBreakdown> = { + upstream_inference_cost: 'upstreamCost', + upstreamInferenceCost: 'upstreamCost', + upstream_inference_prompt_cost: 'upstreamInputCost', + upstreamInferencePromptCost: 'upstreamInputCost', + upstream_inference_input_cost: 'upstreamInputCost', + upstreamInferenceInputCost: 'upstreamInputCost', + upstream_inference_completions_cost: 'upstreamOutputCost', + upstreamInferenceCompletionsCost: 'upstreamOutputCost', + upstream_inference_output_cost: 'upstreamOutputCost', + upstreamInferenceOutputCost: 'upstreamOutputCost', +} + +function asRecord(value: unknown): Record<string, unknown> | undefined { + return typeof value === 'object' && value !== null + ? (value as Record<string, unknown>) + : undefined +} + +/** + * Narrow a raw `cost_details`/`costDetails` map to the canonical fields of + * `UsageCostBreakdown`. Negative values (e.g. discounts) are preserved; `null`, + * non-finite numbers, non-numeric values, and unknown keys are dropped. 
+ */ +function extractCostDetails(details: unknown): UsageCostBreakdown | undefined { + const record = asRecord(details) + if (!record) return undefined + + const out: UsageCostBreakdown = {} + for (const [rawKey, value] of Object.entries(record)) { + const key = KNOWN_DETAIL_KEYS[rawKey] + if (!key) continue + if (typeof value === 'number' && Number.isFinite(value)) { + out[key] = value + } + } + + return Object.keys(out).length > 0 ? out : undefined +} + +/** + * Extract `cost`/`costDetails` from a provider usage object. + * + * - `cost` is attached only when it is a finite number — this preserves `cost === 0` + * and rejects `NaN`/`Infinity`, and does not clamp negative values. + * - `costDetails` is attached only alongside a valid `cost` (an orphan breakdown + * without a total cannot be reconciled and is dropped). Both camelCase + * `costDetails` and snake_case `cost_details` are read. + * + * Returns an empty object when no usable cost is present, so call sites can spread + * the result unconditionally. + */ +export function extractUsageCost(usage: unknown): ExtractedCost { + const record = asRecord(usage) + if (!record) return {} + + const cost = record.cost + if (typeof cost !== 'number' || !Number.isFinite(cost)) return {} + + const costDetails = extractCostDetails( + record.costDetails ?? 
record.cost_details, + ) + + return { + cost, + ...(costDetails && { costDetails }), + } +} diff --git a/packages/ai-openrouter/src/adapters/responses-text.ts b/packages/ai-openrouter/src/adapters/responses-text.ts index ae4037a27..edea58f92 100644 --- a/packages/ai-openrouter/src/adapters/responses-text.ts +++ b/packages/ai-openrouter/src/adapters/responses-text.ts @@ -9,6 +9,7 @@ import { convertFunctionToolToResponsesFormat } from '../internal/responses-tool import { isWebSearchTool } from '../tools/web-search-tool' import { isWebFetchTool } from '../tools/web-fetch-tool' import { getOpenRouterApiKeyFromEnv } from '../utils' +import { extractUsageCost } from './cost' import type { SDKOptions } from '@openrouter/sdk' import type { ResponsesFunctionTool } from '../internal/responses-tool-converter' import type { @@ -623,6 +624,7 @@ export class OpenRouterResponsesTextAdapter< promptTokens: usage.inputTokens ?? 0, completionTokens: usage.outputTokens ?? 0, totalTokens: usage.totalTokens ?? 
0, + ...extractUsageCost(usage), }, }), } @@ -1433,6 +1435,7 @@ export class OpenRouterResponsesTextAdapter< promptTokens: responseObj.usage?.inputTokens || 0, completionTokens: responseObj.usage?.outputTokens || 0, totalTokens: responseObj.usage?.totalTokens || 0, + ...extractUsageCost(responseObj.usage), }, finishReason, } diff --git a/packages/ai-openrouter/src/adapters/text.ts b/packages/ai-openrouter/src/adapters/text.ts index c48394953..ee5a3a5c2 100644 --- a/packages/ai-openrouter/src/adapters/text.ts +++ b/packages/ai-openrouter/src/adapters/text.ts @@ -7,6 +7,7 @@ import { extractRequestOptions } from '../internal/request-options' import { makeStructuredOutputCompatible } from '../internal/schema-converter' import { convertToolsToProviderFormat } from '../tools' import { getOpenRouterApiKeyFromEnv } from '../utils' +import { extractUsageCost } from './cost' import type { SDKOptions } from '@openrouter/sdk' import type { ChatContentItems, @@ -549,6 +550,7 @@ export class OpenRouterTextAdapter< promptTokens: lastUsage.promptTokens, completionTokens: lastUsage.completionTokens, totalTokens: lastUsage.totalTokens, + ...extractUsageCost(lastUsage), }, }), } @@ -1076,6 +1078,7 @@ export class OpenRouterTextAdapter< promptTokens: lastUsage.promptTokens || 0, completionTokens: lastUsage.completionTokens || 0, totalTokens: lastUsage.totalTokens || 0, + ...extractUsageCost(lastUsage), }, }), finishReason, diff --git a/packages/ai-openrouter/tests/cost.test.ts b/packages/ai-openrouter/tests/cost.test.ts new file mode 100644 index 000000000..cd4d0af77 --- /dev/null +++ b/packages/ai-openrouter/tests/cost.test.ts @@ -0,0 +1,145 @@ +import { describe, expect, it } from 'vitest' +import { extractUsageCost } from '../src/adapters/cost' + +describe('extractUsageCost', () => { + it('extracts a finite cost', () => { + expect(extractUsageCost({ cost: 0.0123 })).toEqual({ cost: 0.0123 }) + }) + + it('preserves cost === 0 (not treated as absent)', () => { + 
expect(extractUsageCost({ cost: 0 })).toEqual({ cost: 0 }) + }) + + it('returns empty object when cost is absent', () => { + expect(extractUsageCost({ promptTokens: 5 })).toEqual({}) + }) + + it('returns empty object for non-number / non-finite cost', () => { + expect(extractUsageCost({ cost: '0.5' })).toEqual({}) + expect(extractUsageCost({ cost: NaN })).toEqual({}) + expect(extractUsageCost({ cost: Infinity })).toEqual({}) + expect(extractUsageCost({ cost: null })).toEqual({}) + }) + + it('returns empty object for non-object input', () => { + expect(extractUsageCost(undefined)).toEqual({}) + expect(extractUsageCost(null)).toEqual({}) + expect(extractUsageCost(42)).toEqual({}) + }) + + it('reads costDetails (camelCase) and normalizes to canonical keys', () => { + expect( + extractUsageCost({ + cost: 0.01, + costDetails: { upstreamInferenceCost: 0.008 }, + }), + ).toEqual({ cost: 0.01, costDetails: { upstreamCost: 0.008 } }) + }) + + it('reads cost_details (snake_case) and normalizes to canonical keys', () => { + expect( + extractUsageCost({ + cost: 0.01, + cost_details: { upstream_inference_cost: 0.008 }, + }), + ).toEqual({ cost: 0.01, costDetails: { upstreamCost: 0.008 } }) + }) + + it('collapses Chat Completions prompt/completions onto canonical input/output', () => { + expect( + extractUsageCost({ + cost: 0.0042, + cost_details: { + upstream_inference_completions_cost: 0.0026, + upstream_inference_cost: 0.0038, + upstream_inference_prompt_cost: 0.0012, + }, + }), + ).toEqual({ + cost: 0.0042, + costDetails: { + upstreamOutputCost: 0.0026, + upstreamCost: 0.0038, + upstreamInputCost: 0.0012, + }, + }) + }) + + it('collapses Responses input/output onto the same canonical input/output', () => { + expect( + extractUsageCost({ + cost: 0.0042, + cost_details: { + upstream_inference_cost: 0.0038, + upstream_inference_input_cost: 0.0012, + upstream_inference_output_cost: 0.0026, + }, + }), + ).toEqual({ + cost: 0.0042, + costDetails: { + upstreamCost: 0.0038, + 
upstreamInputCost: 0.0012, + upstreamOutputCost: 0.0026, + }, + }) + }) + + it('prefers camelCase costDetails when both are present', () => { + expect( + extractUsageCost({ + cost: 0.01, + costDetails: { upstreamInferenceCost: 1 }, + cost_details: { upstream_inference_cost: 2 }, + }), + ).toEqual({ cost: 0.01, costDetails: { upstreamCost: 1 } }) + }) + + it('preserves negative detail values (e.g. cache discount)', () => { + expect( + extractUsageCost({ + cost: 0.01, + costDetails: { upstreamInferenceCost: -0.002 }, + }), + ).toEqual({ cost: 0.01, costDetails: { upstreamCost: -0.002 } }) + }) + + it('drops null, non-finite, and non-numeric detail entries', () => { + expect( + extractUsageCost({ + cost: 0.01, + costDetails: { + upstreamInferenceCost: 0.5, + upstreamInferenceInputCost: null, + upstreamInferenceOutputCost: Infinity, + upstreamInferencePromptCost: NaN, + upstreamInferenceCompletionsCost: 'x', + }, + }), + ).toEqual({ cost: 0.01, costDetails: { upstreamCost: 0.5 } }) + }) + + it('drops unknown breakdown keys', () => { + expect( + extractUsageCost({ + cost: 0.01, + costDetails: { + upstreamInferenceCost: 0.008, + futureUnknownField: 0.001, + }, + }), + ).toEqual({ cost: 0.01, costDetails: { upstreamCost: 0.008 } }) + }) + + it('omits costDetails entirely when no known entries remain', () => { + expect( + extractUsageCost({ cost: 0.01, costDetails: { unknownKey: 1 } }), + ).toEqual({ cost: 0.01 }) + }) + + it('drops an orphan costDetails when cost is absent', () => { + expect( + extractUsageCost({ costDetails: { upstreamInferenceCost: 0.008 } }), + ).toEqual({}) + }) +}) diff --git a/packages/ai-openrouter/tests/openrouter-adapter.test.ts b/packages/ai-openrouter/tests/openrouter-adapter.test.ts index e010db062..02a5edb15 100644 --- a/packages/ai-openrouter/tests/openrouter-adapter.test.ts +++ b/packages/ai-openrouter/tests/openrouter-adapter.test.ts @@ -2275,3 +2275,110 @@ describe('OpenRouter convertMessage fail-loud guards', () => { 
expect(assistantMsg.content).toBeNull() }) }) + +describe('OpenRouter cost tracking', () => { + // OpenRouter sends final token usage (and cost) on a trailing chunk whose + // `choices` is empty, arriving AFTER the finishReason chunk. The adapter + // defers RUN_FINISHED until the stream drains so this chunk is captured. + const baseStream = ( + usage: Record<string, unknown>, + ): Array<Record<string, unknown>> => [ + { + id: 'chatcmpl-cost', + model: 'openai/gpt-4o-mini', + choices: [{ delta: { content: 'Hi' }, finishReason: null }], + }, + { + id: 'chatcmpl-cost', + model: 'openai/gpt-4o-mini', + choices: [{ delta: {}, finishReason: 'stop' }], + }, + { + id: 'chatcmpl-cost', + model: 'openai/gpt-4o-mini', + choices: [], + usage, + }, + ] + + const runFinished = async (usage: Record<string, unknown>) => { + setupMockSdkClient(baseStream(usage)) + const chunks: Array<StreamChunk> = [] + for await (const chunk of chat({ + adapter: createAdapter(), + messages: [{ role: 'user', content: 'Hi' }], + })) { + chunks.push(chunk) + } + return chunks.find((c) => c.type === 'RUN_FINISHED') + } + + it('forwards cost and costDetails from the trailing usage chunk', async () => { + // The cost details mirror the full shape the @openrouter/sdk parser emits + // (camelCased), not a partial one the real parser would reject. 
+ const runFinishedChunk = await runFinished({ + promptTokens: 5, + completionTokens: 2, + totalTokens: 7, + cost: 0.0042, + costDetails: { + upstreamInferenceCompletionsCost: 0.0026, + upstreamInferenceCost: 0.0038, + upstreamInferencePromptCost: 0.0012, + }, + }) + expect(runFinishedChunk).toMatchObject({ + type: 'RUN_FINISHED', + usage: { + promptTokens: 5, + completionTokens: 2, + totalTokens: 7, + cost: 0.0042, + costDetails: { + upstreamCost: 0.0038, + upstreamInputCost: 0.0012, + upstreamOutputCost: 0.0026, + }, + }, + }) + }) + + it('preserves cost === 0', async () => { + const runFinishedChunk = await runFinished({ + promptTokens: 5, + completionTokens: 2, + totalTokens: 7, + cost: 0, + }) + expect( + runFinishedChunk?.type === 'RUN_FINISHED' && runFinishedChunk.usage, + ).toMatchObject({ cost: 0 }) + }) + + it('omits cost when the provider does not report it', async () => { + const runFinishedChunk = await runFinished({ + promptTokens: 5, + completionTokens: 2, + totalTokens: 7, + }) + expect(runFinishedChunk?.type).toBe('RUN_FINISHED') + if (runFinishedChunk?.type === 'RUN_FINISHED') { + expect(runFinishedChunk.usage).toMatchObject({ totalTokens: 7 }) + expect(runFinishedChunk.usage).not.toHaveProperty('cost') + expect(runFinishedChunk.usage).not.toHaveProperty('costDetails') + } + }) + + it('does not break streaming on a malformed cost (cost omitted)', async () => { + const runFinishedChunk = await runFinished({ + promptTokens: 5, + completionTokens: 2, + totalTokens: 7, + cost: 'not-a-number', + }) + expect(runFinishedChunk?.type).toBe('RUN_FINISHED') + if (runFinishedChunk?.type === 'RUN_FINISHED') { + expect(runFinishedChunk.usage).not.toHaveProperty('cost') + } + }) +}) diff --git a/packages/ai-openrouter/tests/openrouter-responses-adapter.test.ts b/packages/ai-openrouter/tests/openrouter-responses-adapter.test.ts index 17d86ec5e..4e1ac11ca 100644 --- a/packages/ai-openrouter/tests/openrouter-responses-adapter.test.ts +++ 
b/packages/ai-openrouter/tests/openrouter-responses-adapter.test.ts @@ -996,3 +996,130 @@ describe('OpenRouter responses adapter — SDK constructor wiring', () => { expect(options.headers).toEqual(headers) }) }) + +describe('OpenRouter responses adapter — cost tracking', () => { + beforeEach(() => { + vi.clearAllMocks() + }) + + const runFinishedFor = async (usage: Record<string, unknown>) => { + setupMockSdkClient([ + { + type: 'response.completed', + sequenceNumber: 1, + response: { + model: 'openai/gpt-4o-mini', + output: [], + usage, + }, + }, + ]) + const adapter = createAdapter() + const chunks: Array<StreamChunk> = [] + for await (const chunk of adapter.chatStream({ + model: 'openai/gpt-4o-mini' as any, + messages: [{ role: 'user', content: 'hi' }], + logger: testLogger, + })) { + chunks.push(chunk) + } + return chunks.find((c) => c.type === 'RUN_FINISHED') + } + + it('forwards cost and costDetails from response.completed usage', async () => { + // Responses UsageCostDetails uses input/output cost (not the chat + // prompt/completions shape) per @openrouter/sdk@0.12.35. 
+ const runFinishedChunk = await runFinishedFor({ + inputTokens: 5, + outputTokens: 2, + totalTokens: 7, + cost: 0.0042, + costDetails: { + upstreamInferenceCost: 0.0038, + upstreamInferenceInputCost: 0.0012, + upstreamInferenceOutputCost: 0.0026, + }, + }) + expect(runFinishedChunk).toMatchObject({ + type: 'RUN_FINISHED', + usage: { + promptTokens: 5, + completionTokens: 2, + totalTokens: 7, + cost: 0.0042, + costDetails: { + upstreamCost: 0.0038, + upstreamInputCost: 0.0012, + upstreamOutputCost: 0.0026, + }, + }, + }) + }) + + it('omits cost when the provider does not report it', async () => { + const runFinishedChunk = await runFinishedFor({ + inputTokens: 5, + outputTokens: 2, + totalTokens: 7, + }) + expect(runFinishedChunk?.type).toBe('RUN_FINISHED') + if (runFinishedChunk?.type === 'RUN_FINISHED') { + expect(runFinishedChunk.usage).not.toHaveProperty('cost') + expect(runFinishedChunk.usage).not.toHaveProperty('costDetails') + } + }) + + // The SDK surfaces wire shapes it can't parse as { isUnknown, raw } events, + // where usage stays in raw snake_case. costDetails keys must still normalize + // to canonical camelCase so this fallback path matches the SDK-parsed path. 
+ it('normalizes cost details from a raw (UNKNOWN) response.completed event', async () => { + setupMockSdkClient([ + { + isUnknown: true, + raw: { + type: 'response.completed', + sequence_number: 1, + response: { + model: 'openai/gpt-4o-mini', + output: [], + usage: { + input_tokens: 11, + output_tokens: 3, + total_tokens: 14, + cost: 0.0042, + cost_details: { + upstream_inference_cost: 0.0038, + upstream_inference_input_cost: 0.0012, + upstream_inference_output_cost: 0.0026, + }, + }, + }, + }, + }, + ]) + const adapter = createAdapter() + const chunks: Array<StreamChunk> = [] + for await (const chunk of adapter.chatStream({ + model: 'openai/gpt-4o-mini' as any, + messages: [{ role: 'user', content: 'hi' }], + logger: testLogger, + })) { + chunks.push(chunk) + } + const runFinishedChunk = chunks.find((c) => c.type === 'RUN_FINISHED') + expect(runFinishedChunk).toMatchObject({ + type: 'RUN_FINISHED', + usage: { + promptTokens: 11, + completionTokens: 3, + totalTokens: 14, + cost: 0.0042, + costDetails: { + upstreamCost: 0.0038, + upstreamInputCost: 0.0012, + upstreamOutputCost: 0.0026, + }, + }, + }) + }) +}) diff --git a/packages/ai/src/activities/chat/middleware/types.ts b/packages/ai/src/activities/chat/middleware/types.ts index 76456d573..8e507bd70 100644 --- a/packages/ai/src/activities/chat/middleware/types.ts +++ b/packages/ai/src/activities/chat/middleware/types.ts @@ -4,6 +4,7 @@ import type { StreamChunk, Tool, ToolCall, + UsageTotals, } from '../../../types' import type { SystemPrompt } from '../../../system-prompts' @@ -264,12 +265,12 @@ export interface ToolPhaseCompleteInfo { /** * Token usage statistics passed to the onUsage hook. * Extracted from the RUN_FINISHED chunk when usage data is present. + * + * Includes optional provider-reported `cost`/`costDetails` (see {@link UsageTotals}). + * Kept as an interface extending `UsageTotals` to preserve declaration merging for + * this publicly exported type. 
*/ -export interface UsageInfo { - promptTokens: number - completionTokens: number - totalTokens: number -} +export interface UsageInfo extends UsageTotals {} // =========================== // Terminal Hook Info @@ -285,14 +286,8 @@ export interface FinishInfo { duration: number /** Final accumulated text content */ content: string - /** Final usage totals, if available */ - usage?: - | { - promptTokens: number - completionTokens: number - totalTokens: number - } - | undefined + /** Final usage totals, if available (optionally including provider-reported cost) */ + usage?: UsageTotals | undefined } /** diff --git a/packages/ai/src/types.ts b/packages/ai/src/types.ts index 6c596a2dc..c04adfa84 100644 --- a/packages/ai/src/types.ts +++ b/packages/ai/src/types.ts @@ -925,6 +925,39 @@ export interface RunStartedEvent extends AGUIRunStartedEvent { model?: string } +/** + * Provider-reported cost breakdown for a single request, normalized onto a + * canonical shape so consumer code is portable across gateways. Each adapter's + * extractor maps its provider-specific wire keys (e.g. OpenRouter's + * `upstream_inference_prompt_cost`, `upstream_inference_input_cost`) onto these + * fields at runtime. + */ +export interface UsageCostBreakdown { + /** Total cost the gateway paid the upstream provider. */ + upstreamCost?: number + /** Upstream cost for input (prompt) tokens. */ + upstreamInputCost?: number + /** Upstream cost for output (completion) tokens. */ + upstreamOutputCost?: number +} + +/** + * Token usage totals for a run, optionally including provider-reported cost. + * + * `cost` and `costDetails` are populated only by adapters whose provider returns + * authoritative per-request cost (e.g. OpenRouter). They are absent for adapters + * that do not report cost, so consumers must treat them as optional. 
+ */ +export interface UsageTotals { + promptTokens: number + completionTokens: number + totalTokens: number + /** Provider-reported cost for the request, when available. */ + cost?: number + /** Provider-reported cost breakdown, when available. */ + costDetails?: UsageCostBreakdown +} + /** * Emitted when a run completes successfully. * @@ -936,12 +969,8 @@ export interface RunFinishedEvent extends AGUIRunFinishedEvent { model?: string /** Why the generation stopped */ finishReason?: 'stop' | 'length' | 'content_filter' | 'tool_calls' | null - /** Token usage statistics */ - usage?: { - promptTokens: number - completionTokens: number - totalTokens: number - } + /** Token usage statistics, optionally including provider-reported cost. */ + usage?: UsageTotals } /** diff --git a/packages/ai/tests/usage-cost-types.test.ts b/packages/ai/tests/usage-cost-types.test.ts new file mode 100644 index 000000000..8934c04aa --- /dev/null +++ b/packages/ai/tests/usage-cost-types.test.ts @@ -0,0 +1,61 @@ +import { describe, expectTypeOf, it } from 'vitest' +import type { + RunFinishedEvent, + UsageCostBreakdown, + UsageTotals, +} from '../src/types' +import type { + FinishInfo, + UsageInfo, +} from '../src/activities/chat/middleware/types' + +// Locks the additive cost contract: the optional `cost`/`costDetails` fields +// must be present on every public usage surface so middleware and event +// consumers can read provider-reported cost without casts. The breakdown shape +// is canonical (provider-neutral) — adapter extractors normalize their +// wire-specific keys onto these three fields. 
+describe('usage cost type surface', () => { + it('UsageTotals exposes optional cost and a UsageCostBreakdown', () => { + expectTypeOf<UsageTotals['cost']>().toEqualTypeOf<number | undefined>() + expectTypeOf<UsageTotals['costDetails']>().toEqualTypeOf< + UsageCostBreakdown | undefined + >() + }) + + it('UsageCostBreakdown enumerates the canonical breakdown fields', () => { + expectTypeOf<UsageCostBreakdown['upstreamCost']>().toEqualTypeOf< + number | undefined + >() + expectTypeOf<UsageCostBreakdown['upstreamInputCost']>().toEqualTypeOf< + number | undefined + >() + expectTypeOf<UsageCostBreakdown['upstreamOutputCost']>().toEqualTypeOf< + number | undefined + >() + }) + + it('RunFinishedEvent.usage carries cost/costDetails', () => { + expectTypeOf< + NonNullable<RunFinishedEvent['usage']>['cost'] + >().toEqualTypeOf<number | undefined>() + expectTypeOf< + NonNullable<RunFinishedEvent['usage']>['costDetails'] + >().toEqualTypeOf<UsageCostBreakdown | undefined>() + }) + + it('UsageInfo (onUsage) carries cost/costDetails', () => { + expectTypeOf<UsageInfo['cost']>().toEqualTypeOf<number | undefined>() + expectTypeOf<UsageInfo['costDetails']>().toEqualTypeOf< + UsageCostBreakdown | undefined + >() + }) + + it('FinishInfo.usage (onFinish) carries cost/costDetails', () => { + expectTypeOf<NonNullable<FinishInfo['usage']>['cost']>().toEqualTypeOf< + number | undefined + >() + expectTypeOf< + NonNullable<FinishInfo['usage']>['costDetails'] + >().toEqualTypeOf<UsageCostBreakdown | undefined>() + }) +}) diff --git a/testing/e2e/global-setup.ts b/testing/e2e/global-setup.ts index 5a08e63d0..198f7e780 100644 --- a/testing/e2e/global-setup.ts +++ b/testing/e2e/global-setup.ts @@ -51,6 +51,13 @@ export default async function globalSetup() { // the Anthropic adapter here. mock.mount('/anthropic-bug-test', anthropicServerToolBugMount()) + // OpenRouter per-request cost capture. aimock's OpenAI-compatible chat + // helper doesn't synthesize OpenRouter's `usage.cost` / `usage.cost_details`, + // and crucially those land on a trailing usage-only chunk (choices: []) that + // arrives AFTER the finish_reason chunk. This mount hand-crafts that exact + // wire shape so the companion spec can assert cost reaches RUN_FINISHED.usage. 
+ mock.mount('/openrouter-cost', openRouterCostMount()) + await mock.start() console.log(`[aimock] started on port 4010`) ;(globalThis as any).__aimock = mock @@ -299,6 +306,84 @@ function anthropicServerToolBugMount(): Mountable { } } +/** + * Emits an OpenAI-compatible chat-completion SSE stream that ends with a + * usage-only trailing chunk carrying OpenRouter's `cost` / `cost_details`. + * Snake_case on the wire is camelCased by the `@openrouter/sdk` parser, so the + * adapter sees `usage.cost` and `usage.costDetails.upstreamInferenceCost`. + */ +function openRouterCostMount(): Mountable { + return { + async handleRequest( + req: http.IncomingMessage, + res: http.ServerResponse, + pathname: string, + ): Promise<boolean> { + // The mount prefix (/openrouter-cost) is stripped before dispatch; the + // SDK posts to /chat/completions where serverURL ends in /v1. + if ( + req.method !== 'POST' || + !pathname.startsWith('/v1/chat/completions') + ) { + return false + } + await drainBody(req) + + const base = { + id: 'chatcmpl-cost-e2e', + object: 'chat.completion.chunk', + // The @openrouter/sdk chunk schema requires a numeric `created`. + created: 1700000000, + model: 'openai/gpt-4o', + } + const chunks: Array<Record<string, unknown>> = [ + { + ...base, + choices: [ + { + index: 0, + delta: { role: 'assistant', content: 'Hi' }, + finish_reason: null, + }, + ], + }, + { + ...base, + choices: [{ index: 0, delta: {}, finish_reason: 'stop' }], + }, + // Trailing usage-only chunk: the whole point of the test. Field names + // mirror OpenRouter's CostDetails schema (camelCased by the SDK parser). 
+ { + ...base, + choices: [], + usage: { + prompt_tokens: 11, + completion_tokens: 3, + total_tokens: 14, + cost: 0.0042, + cost_details: { + upstream_inference_completions_cost: 0.0026, + upstream_inference_cost: 0.0038, + upstream_inference_prompt_cost: 0.0012, + }, + }, + }, + ] + + res.statusCode = 200 + res.setHeader('Content-Type', 'text/event-stream') + res.setHeader('Cache-Control', 'no-cache') + res.setHeader('Connection', 'keep-alive') + for (const chunk of chunks) { + res.write(`data: ${JSON.stringify(chunk)}\n\n`) + } + res.write('data: [DONE]\n\n') + res.end() + return true + }, + } +} + function buildToolPlusServerToolEvents(): Array> { const messageId = 'msg_bug_604' const model = 'claude-sonnet-4-5' diff --git a/testing/e2e/src/routeTree.gen.ts b/testing/e2e/src/routeTree.gen.ts index 9270f412d..90a70acdc 100644 --- a/testing/e2e/src/routeTree.gen.ts +++ b/testing/e2e/src/routeTree.gen.ts @@ -26,6 +26,7 @@ import { Route as ApiTranscriptionRouteImport } from './routes/api.transcription import { Route as ApiToolsTestRouteImport } from './routes/api.tools-test' import { Route as ApiSummarizeRouteImport } from './routes/api.summarize' import { Route as ApiOpenrouterWebToolsWireRouteImport } from './routes/api.openrouter-web-tools-wire' +import { Route as ApiOpenrouterCostRouteImport } from './routes/api.openrouter-cost' import { Route as ApiMiddlewareTestRouteImport } from './routes/api.middleware-test' import { Route as ApiImageRouteImport } from './routes/api.image' import { Route as ApiChatRouteImport } from './routes/api.chat' @@ -124,6 +125,11 @@ const ApiOpenrouterWebToolsWireRoute = path: '/api/openrouter-web-tools-wire', getParentRoute: () => rootRouteImport, } as any) +const ApiOpenrouterCostRoute = ApiOpenrouterCostRouteImport.update({ + id: '/api/openrouter-cost', + path: '/api/openrouter-cost', + getParentRoute: () => rootRouteImport, +} as any) const ApiMiddlewareTestRoute = ApiMiddlewareTestRouteImport.update({ id: '/api/middleware-test', 
   path: '/api/middleware-test',
@@ -197,6 +203,7 @@ export interface FileRoutesByFullPath {
   '/api/chat': typeof ApiChatRoute
   '/api/image': typeof ApiImageRouteWithChildren
   '/api/middleware-test': typeof ApiMiddlewareTestRoute
+  '/api/openrouter-cost': typeof ApiOpenrouterCostRoute
   '/api/openrouter-web-tools-wire': typeof ApiOpenrouterWebToolsWireRoute
   '/api/summarize': typeof ApiSummarizeRoute
   '/api/tools-test': typeof ApiToolsTestRoute
@@ -227,6 +234,7 @@ export interface FileRoutesByTo {
   '/api/chat': typeof ApiChatRoute
   '/api/image': typeof ApiImageRouteWithChildren
   '/api/middleware-test': typeof ApiMiddlewareTestRoute
+  '/api/openrouter-cost': typeof ApiOpenrouterCostRoute
   '/api/openrouter-web-tools-wire': typeof ApiOpenrouterWebToolsWireRoute
   '/api/summarize': typeof ApiSummarizeRoute
   '/api/tools-test': typeof ApiToolsTestRoute
@@ -258,6 +266,7 @@ export interface FileRoutesById {
   '/api/chat': typeof ApiChatRoute
   '/api/image': typeof ApiImageRouteWithChildren
   '/api/middleware-test': typeof ApiMiddlewareTestRoute
+  '/api/openrouter-cost': typeof ApiOpenrouterCostRoute
   '/api/openrouter-web-tools-wire': typeof ApiOpenrouterWebToolsWireRoute
   '/api/summarize': typeof ApiSummarizeRoute
   '/api/tools-test': typeof ApiToolsTestRoute
@@ -290,6 +299,7 @@ export interface FileRouteTypes {
     | '/api/chat'
     | '/api/image'
     | '/api/middleware-test'
+    | '/api/openrouter-cost'
     | '/api/openrouter-web-tools-wire'
     | '/api/summarize'
     | '/api/tools-test'
@@ -320,6 +330,7 @@ export interface FileRouteTypes {
     | '/api/chat'
     | '/api/image'
     | '/api/middleware-test'
+    | '/api/openrouter-cost'
     | '/api/openrouter-web-tools-wire'
     | '/api/summarize'
     | '/api/tools-test'
@@ -350,6 +361,7 @@ export interface FileRouteTypes {
     | '/api/chat'
     | '/api/image'
     | '/api/middleware-test'
+    | '/api/openrouter-cost'
     | '/api/openrouter-web-tools-wire'
     | '/api/summarize'
     | '/api/tools-test'
@@ -381,6 +393,7 @@ export interface RootRouteChildren {
   ApiChatRoute: typeof ApiChatRoute
   ApiImageRoute: typeof ApiImageRouteWithChildren
   ApiMiddlewareTestRoute: typeof ApiMiddlewareTestRoute
+  ApiOpenrouterCostRoute: typeof ApiOpenrouterCostRoute
   ApiOpenrouterWebToolsWireRoute: typeof ApiOpenrouterWebToolsWireRoute
   ApiSummarizeRoute: typeof ApiSummarizeRoute
   ApiToolsTestRoute: typeof ApiToolsTestRoute
@@ -511,6 +524,13 @@ declare module '@tanstack/react-router' {
       preLoaderRoute: typeof ApiOpenrouterWebToolsWireRouteImport
       parentRoute: typeof rootRouteImport
     }
+    '/api/openrouter-cost': {
+      id: '/api/openrouter-cost'
+      path: '/api/openrouter-cost'
+      fullPath: '/api/openrouter-cost'
+      preLoaderRoute: typeof ApiOpenrouterCostRouteImport
+      parentRoute: typeof rootRouteImport
+    }
     '/api/middleware-test': {
       id: '/api/middleware-test'
       path: '/api/middleware-test'
@@ -666,6 +686,7 @@ const rootRouteChildren: RootRouteChildren = {
   ApiChatRoute: ApiChatRoute,
   ApiImageRoute: ApiImageRouteWithChildren,
   ApiMiddlewareTestRoute: ApiMiddlewareTestRoute,
+  ApiOpenrouterCostRoute: ApiOpenrouterCostRoute,
   ApiOpenrouterWebToolsWireRoute: ApiOpenrouterWebToolsWireRoute,
   ApiSummarizeRoute: ApiSummarizeRoute,
   ApiToolsTestRoute: ApiToolsTestRoute,
diff --git a/testing/e2e/src/routes/api.openrouter-cost.ts b/testing/e2e/src/routes/api.openrouter-cost.ts
new file mode 100644
index 000000000..8c0fe56cf
--- /dev/null
+++ b/testing/e2e/src/routes/api.openrouter-cost.ts
@@ -0,0 +1,54 @@
+import { createFileRoute } from '@tanstack/react-router'
+import { chat, createChatOptions } from '@tanstack/ai'
+import { createOpenRouterText } from '@tanstack/ai-openrouter'
+
+const LLMOCK_DEFAULT_BASE = process.env.LLMOCK_URL || 'http://127.0.0.1:4010'
+const DUMMY_KEY = 'sk-e2e-test-dummy-key'
+
+/**
+ * Drives the OpenRouter chat-completions adapter against a hand-crafted aimock
+ * mount (`/openrouter-cost`) whose stream ends with a usage-only chunk carrying
+ * `cost` / `cost_details`. The companion spec asserts that those values reach
+ * `RUN_FINISHED.usage` — proving the adapter forwards OpenRouter's
+ * provider-reported per-request cost.
+ */
+export const Route = createFileRoute('/api/openrouter-cost')({
+  server: {
+    handlers: {
+      POST: async () => {
+        const adapter = createOpenRouterText(
+          'openai/gpt-4o' as never,
+          DUMMY_KEY,
+          {
+            serverURL: `${LLMOCK_DEFAULT_BASE}/openrouter-cost/v1`,
+          },
+        )
+
+        let usage: Record | undefined
+        try {
+          for await (const chunk of chat({
+            ...createChatOptions({ adapter }),
+            messages: [{ role: 'user', content: 'hi' }],
+          })) {
+            if (chunk.type === 'RUN_FINISHED') {
+              usage = chunk.usage as Record | undefined
+            }
+          }
+        } catch (error) {
+          return new Response(
+            JSON.stringify({
+              ok: false,
+              error: error instanceof Error ? error.message : String(error),
+            }),
+            { status: 200, headers: { 'Content-Type': 'application/json' } },
+          )
+        }
+
+        return new Response(JSON.stringify({ ok: true, usage }), {
+          status: 200,
+          headers: { 'Content-Type': 'application/json' },
+        })
+      },
+    },
+  },
+})
diff --git a/testing/e2e/tests/openrouter-cost.spec.ts b/testing/e2e/tests/openrouter-cost.spec.ts
new file mode 100644
index 000000000..8b1e4da96
--- /dev/null
+++ b/testing/e2e/tests/openrouter-cost.spec.ts
@@ -0,0 +1,43 @@
+import type { UsageCostBreakdown } from '@tanstack/ai'
+import { test, expect } from './fixtures'
+
+/**
+ * Verifies that OpenRouter's provider-reported per-request cost reaches
+ * `RUN_FINISHED.usage`. The `/api/openrouter-cost` route drives the OpenRouter
+ * chat adapter against a hand-crafted aimock mount whose stream ends with a
+ * usage-only chunk carrying `cost` / `cost_details` (snake_case on the wire,
+ * camelCased by the SDK parser). The adapter defers RUN_FINISHED until the
+ * stream drains, so that trailing chunk is captured.
+ */
+test.describe('openrouter — per-request cost', () => {
+  test('cost and costDetails reach RUN_FINISHED.usage', async ({ request }) => {
+    const res = await request.post('/api/openrouter-cost')
+    expect(res.ok()).toBe(true)
+
+    const { ok, usage, error } = (await res.json()) as {
+      ok: boolean
+      error?: string
+      usage?: {
+        promptTokens?: number
+        completionTokens?: number
+        totalTokens?: number
+        cost?: number
+        costDetails?: UsageCostBreakdown
+      }
+    }
+
+    expect(error ?? null).toBeNull()
+    expect(ok).toBe(true)
+    expect(usage).toMatchObject({
+      promptTokens: 11,
+      completionTokens: 3,
+      totalTokens: 14,
+      cost: 0.0042,
+      costDetails: {
+        upstreamCost: 0.0038,
+        upstreamInputCost: 0.0012,
+        upstreamOutputCost: 0.0026,
+      },
+    })
+  })
+})
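
Reviewer note: the spec above expects the mock's snake_case `cost_details` keys to arrive as the three canonical `UsageCostBreakdown` fields. The mapping it relies on can be sketched roughly as follows. This is a minimal illustration, not the adapter's actual extractor: `normalizeCostDetails` and `WireCostDetails` are hypothetical names, the interface is redeclared locally so the snippet is self-contained, and the sketch reads the raw wire keys directly, whereas the real adapter reportedly sees them after the SDK parser has camelCased them.

```typescript
// Canonical field names as listed in the changeset; redeclared locally
// (assumption: all three are optional numbers) to keep the sketch standalone.
interface UsageCostBreakdown {
  upstreamCost?: number
  upstreamInputCost?: number
  upstreamOutputCost?: number
}

// Wire shape copied from the mock's trailing usage-only chunk.
// Hypothetical type name, introduced only for this illustration.
interface WireCostDetails {
  upstream_inference_cost?: number
  upstream_inference_prompt_cost?: number
  upstream_inference_completions_cost?: number
}

// Hypothetical helper: maps Chat Completions' prompt/completions naming onto
// the canonical input/output split and drops anything that is not a number.
function normalizeCostDetails(
  wire: WireCostDetails | undefined,
): UsageCostBreakdown | undefined {
  if (!wire) return undefined
  const out: UsageCostBreakdown = {}
  if (typeof wire.upstream_inference_cost === 'number') {
    out.upstreamCost = wire.upstream_inference_cost
  }
  if (typeof wire.upstream_inference_prompt_cost === 'number') {
    out.upstreamInputCost = wire.upstream_inference_prompt_cost
  }
  if (typeof wire.upstream_inference_completions_cost === 'number') {
    out.upstreamOutputCost = wire.upstream_inference_completions_cost
  }
  // Report "no breakdown" rather than an empty object.
  return Object.keys(out).length > 0 ? out : undefined
}
```

Returning `undefined` for an absent or empty breakdown keeps the field additive, which matches the changeset's statement that adapters not reporting cost are unaffected.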