Problem
AgentV results are viewed in the terminal or by reading JSONL files. For larger eval suites, a visual interface makes it much easier to:
- Browse and filter results
- Compare runs side-by-side
- Drill into individual case evaluator scores
- Share results with team members
Proposal
Add a local web viewer that serves eval results:
# View results from a completed run
agentv view results.jsonl
# View and compare two runs
agentv view before.jsonl after.jsonl
# Auto-open after eval
agentv run --target my-agent evals/ --view
Features
- Results table — sortable by score, duration, cost, pass/fail
- Case detail view — expand to see all evaluator scores and reasons
- Run comparison — side-by-side delta view (reuses existing compare logic)
- Filters — by pass/fail, evaluator type, score range
- No external dependencies — self-contained, runs locally
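As a rough illustration of what the run comparison and score-range filter could compute, here is a minimal Python sketch. It is not AgentV's existing compare logic, and the record fields `id` and `score` are assumptions made for the example; the real JSONL schema may differ.

```python
import json
from typing import Iterable


def parse_jsonl(text: str) -> list[dict]:
    """Parse JSONL text into a list of records, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def compare_runs(before: Iterable[dict], after: Iterable[dict]) -> list[dict]:
    """Pair cases by id and compute the score delta (after minus before).

    The "id" and "score" keys are hypothetical field names.
    """
    before_by_id = {record["id"]: record for record in before}
    rows = []
    for record in after:
        previous = before_by_id.get(record["id"])
        if previous is None:
            continue  # case exists only in the newer run, so there is no delta
        rows.append({
            "id": record["id"],
            "before": previous["score"],
            "after": record["score"],
            "delta": record["score"] - previous["score"],
        })
    return rows


def filter_by_score(records: Iterable[dict],
                    min_score: float = 0.0,
                    max_score: float = 1.0) -> list[dict]:
    """Keep records whose score lies in the inclusive range."""
    return [r for r in records if min_score <= r["score"] <= max_score]
```

A viewer could sort the rows returned by `compare_runs` by `delta` to surface regressions first, and apply `filter_by_score` before rendering the table.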
Design Principles
- CLI-first remains primary — the viewer is supplementary, not required
- No server signup — runs locally on localhost
- No persistent backend — reads JSONL files directly
- Lightweight — minimal framework, fast startup
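To make the "localhost only, no persistent backend" idea concrete, the sketch below serves a results file using only the Python standard library. It is an illustration under stated assumptions, not a proposed implementation: the `/api/results` route, the `create_server` helper, and the choice of Python are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path


def make_handler(results_path: Path):
    """Build a request handler bound to one JSONL results file."""

    class ResultsHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # "/api/results" is a hypothetical route used for illustration.
            if self.path != "/api/results":
                self.send_error(404)
                return
            # Re-read the file on every request: no database, no cache.
            text = results_path.read_text(encoding="utf-8")
            records = [json.loads(line) for line in text.splitlines() if line.strip()]
            body = json.dumps(records).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the terminal quiet

    return ResultsHandler


def create_server(results_path: str, port: int = 0) -> ThreadingHTTPServer:
    """Bind to the loopback interface only; port 0 lets the OS pick a free port."""
    return ThreadingHTTPServer(("127.0.0.1", port), make_handler(Path(results_path)))
```

Calling `create_server("results.jsonl", port=4173).serve_forever()` would then expose the records at `http://127.0.0.1:4173/api/results` (the port number is arbitrary). Binding to `127.0.0.1` rather than `0.0.0.0` keeps the viewer unreachable from other machines.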
Why This Matters
- Visual exploration catches patterns that tables miss
- Comparing runs is the most common evaluation workflow, and the comparison view supports it directly
- Accessibility for non-CLI users (PMs, stakeholders)
- Stays true to AgentV principles: local, no signup, no overhead
Acceptance Criteria
- agentv view command serves a local web UI