docs: update compare command references for N-way matrix mode #384
Merged
All references to `agentv compare` previously only documented two-file pairwise mode. Updated to show the N-way matrix as the primary workflow with --baseline, --candidate, and --targets flags.

Updated files:
- README.md (root + CLI): matrix output example, baseline/pairwise commands
- docs/COMPARISON.md: CI example with --baseline regression gate
- examples/features/compare/: N-way matrix + pairwise examples with output
- examples/showcase/multi-model-benchmark/: combined JSONL workflow
- plugins/agentv-dev/skills/agentv-eval-builder/: compare command reference
Deploying agentv
Latest commit: 0650078
Status: ✅ Deploy successful!
Preview URL: https://41cbee75.agentv.pages.dev
Branch Preview URL: https://docs-compare-nway-matrix.agentv.pages.dev
The `evaluators` block was renamed to `assert` in the eval YAML schema. Update both code examples in COMPARISON.md to use the current syntax.
Summary
All documentation and skills previously only referenced the two-file pairwise `agentv compare a.jsonl b.jsonl` workflow. Updated to show N-way matrix as the primary workflow with `--baseline`, `--candidate`, and `--targets` flags.

Files updated:
- `README.md` (root + CLI mirror): matrix output example, baseline/pairwise/two-file commands
- `docs/COMPARISON.md`: CI/CD example with `--baseline` regression gate
- `examples/features/compare/README.md`: N-way matrix + pairwise modes
- `examples/features/compare/evals/README.md`: full output examples for all modes
- `examples/showcase/multi-model-benchmark/README.md`: combined JSONL workflow, updated flow diagram
- `plugins/agentv-dev/skills/agentv-eval-builder/SKILL.md`: compare command reference

Test plan
- `agentv compare` output from fixtures

Follows up #382, #383.
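
For reviewers unfamiliar with the idea, here is a rough Python sketch of what an N-way comparison with a baseline regression gate could look like conceptually. It is not agentv's implementation: the record fields (`target`, `test_id`, `score`), the function names, and the tolerance parameter are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_matrix(records):
    """Return the mean score per target (hypothetical 'target'/'score' fields)."""
    scores = defaultdict(list)
    for rec in records:
        scores[rec["target"]].append(rec["score"])
    return {target: sum(vals) / len(vals) for target, vals in scores.items()}


def regressions(matrix, baseline, tolerance=0.0):
    """Return targets whose mean score is below the baseline's by more than tolerance."""
    base = matrix[baseline]
    return sorted(
        target
        for target, score in matrix.items()
        if target != baseline and score < base - tolerance
    )


# Hypothetical combined results: one record per (target, test) pair.
records = [
    {"target": "model-a", "test_id": "t1", "score": 0.5},
    {"target": "model-a", "test_id": "t2", "score": 1.0},
    {"target": "model-b", "test_id": "t1", "score": 1.0},
    {"target": "model-b", "test_id": "t2", "score": 1.0},
    {"target": "model-c", "test_id": "t1", "score": 0.5},
    {"target": "model-c", "test_id": "t2", "score": 0.5},
]

matrix = build_matrix(records)
failed = regressions(matrix, baseline="model-a")
print(matrix)
print(failed)
```

A CI gate built on this idea would exit non-zero when `failed` is non-empty; how `agentv compare --baseline` actually decides pass or fail is documented in `docs/COMPARISON.md`, not here.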