feat(core): add OpenAI Codex SDK provider replacing CLI subprocess #236

## Summary

Replace the current `codex` provider (CLI subprocess via `spawn`) with a new provider using the official [`@openai/codex`](https://www.npmjs.com/package/@openai/codex) SDK. This follows the same pattern as the `copilot` provider (PR #211) and the planned `claude` provider (#213).

## Motivation

The current Codex provider spawns the `codex` CLI as a subprocess and scrapes its JSONL output. The SDK provides:
- **Structured responses** with typed messages
- **Token usage** from response metadata
- **Cost tracking** from SDK-reported usage
- **Tool call extraction** as structured data (not scraped text)
- **Abort/timeout control** via SDK options

## Competitive Advantage

Like the Claude and Copilot SDK providers, this gives AgentV structured access to agent internals (tool calls, token usage, cost), so the `tool_trajectory`, `tool_call_f1`, and `execution_metrics` evaluators work out of the box. Most eval frameworks treat Codex as a text-in/text-out black box.

## Implementation Plan

Follow the same pattern as the `copilot` provider (PR #211):

### Files to modify
1. **`types.ts`** - Update `ProviderKind` to add `codex-sdk` or alias the existing `codex`
2. **`targets.ts`** - Update resolved config and target resolution
3. **`index.ts`** - Wire new provider
4. **`targets-validator.ts`** - Update settings validation

### Files to create
1. **`codex-sdk.ts`** - New provider using `@openai/codex` SDK
   - Lazy-load SDK (same pattern as copilot/claude providers)
   - Extract structured tool calls → `ToolCall[]` and `OutputMessage[]`
   - Extract token usage, cost, duration from SDK response
2. **`codex-sdk.test.ts`** - Unit tests with mocked SDK
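
A minimal sketch of the two responsibilities listed for `codex-sdk.ts`: loading the SDK lazily and folding its output into tool calls, messages, and token usage. Everything named here is an assumption made for illustration: the `SdkItem` union is not the SDK's real event type, the `ToolCall` and `OutputMessage` shapes may not match AgentV's actual types, and `lazyLoad` and `collect` are hypothetical helpers. In the real provider the loader passed to `lazyLoad` would presumably wrap a dynamic `import()` of the SDK package.

```typescript
// Hypothetical shapes: AgentV's real ToolCall / OutputMessage may differ.
interface ToolCall {
  id: string;
  tool: string;
  input: unknown;
  output?: unknown;
}

interface OutputMessage {
  role: "assistant";
  content: string;
}

interface TokenUsage {
  input: number;
  output: number;
}

// Assumed item union, for illustration only (not the SDK's real types).
type SdkItem =
  | { type: "message"; text: string }
  | { type: "tool_call"; id: string; name: string; args: unknown; result?: unknown }
  | { type: "usage"; inputTokens: number; outputTokens: number };

// Memoize an async loader so the underlying module is loaded at most once,
// and only when a target actually uses this provider.
function lazyLoad<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = load();
    }
    return cached;
  };
}

// Fold a list of items into messages, tool calls, and summed token usage.
function collect(items: SdkItem[]): {
  messages: OutputMessage[];
  toolCalls: ToolCall[];
  usage: TokenUsage;
} {
  const messages: OutputMessage[] = [];
  const toolCalls: ToolCall[] = [];
  const usage: TokenUsage = { input: 0, output: 0 };

  for (const item of items) {
    switch (item.type) {
      case "message":
        messages.push({ role: "assistant", content: item.text });
        break;
      case "tool_call":
        toolCalls.push({
          id: item.id,
          tool: item.name,
          input: item.args,
          output: item.result,
        });
        break;
      case "usage":
        usage.input += item.inputTokens;
        usage.output += item.outputTokens;
        break;
    }
  }

  return { messages, toolCalls, usage };
}
```

Keeping the loader injectable also makes `codex-sdk.test.ts` straightforward: a test can pass a loader that resolves to a mock instead of the real SDK.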

### Files to update/delete
1. **`codex.ts`** - Keep as fallback or deprecate
2. **`codex-log-tracker.ts`** - Update for SDK events

### Dependencies
- Add `@openai/codex` to `packages/core/package.json`
- Add it to the `external` list in the CLI's `tsup.config.ts`

## Key Design Decisions

- **Naming**: `codex` is the canonical name and uses the SDK; the old CLI provider remains available as `codex-cli` if needed
- **Tool calls**: Extract structured tool use data for trajectory evaluation
- **Backward compat**: Existing `codex` configs should work with the new provider
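
The naming and backward-compatibility decisions above could be sketched as a small lookup from the configured provider name to an implementation. The `CodexImplementation` labels and the `resolveCodexImplementation` helper are hypothetical, not existing AgentV code; only the `codex` and `codex-cli` names come from this proposal.

```typescript
// Hypothetical labels for the two implementations discussed in this issue.
type CodexImplementation = "sdk" | "cli-subprocess";

// `codex` stays the name used in existing configs and now maps to the SDK;
// `codex-cli` is the opt-in name for the old subprocess provider.
const CODEX_IMPLEMENTATIONS = new Map<string, CodexImplementation>([
  ["codex", "sdk"],
  ["codex-cli", "cli-subprocess"],
]);

// Returns undefined for names that are not Codex providers, so the caller
// can fall through to its other provider kinds.
function resolveCodexImplementation(kind: string): CodexImplementation | undefined {
  return CODEX_IMPLEMENTATIONS.get(kind.trim().toLowerCase());
}
```

With this shape, a target that already declares `codex` picks up the SDK provider without any config change, and switching back to the subprocess path is a one-word edit.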

## Supersedes

This supersedes #99 (execution metrics for Codex), because the SDK provider will return metrics natively.

## References

- PR #211 - `copilot` provider implementation (pattern to follow)
- #213 - `claude` provider using Agent SDK (same pattern)
- Current implementation: `packages/core/src/evaluation/providers/codex.ts`
