…t selection
Summary:
- Add a grouped format option that displays multiple LLM answers per question side-by-side
- Add a format selector dropdown and CSV export for both formats: [row, grouped]; the default is row
- Support CSV export for the grouped format
📝 Walkthrough
The changes introduce support for displaying and exporting evaluation results in a grouped format alongside the existing row format. This includes backend support for an export_format query parameter, a new grouped result table component, type definitions for grouped data structures, UI controls for format selection, and CSV export functions.
Sequence Diagram
sequenceDiagram
participant User
participant Page as Evaluation Page
participant Backend as Backend API
participant Component as DetailedResultsTable
User->>Page: Select export format (Grouped/Row)
Page->>Page: Update exportFormat state
Page->>Backend: Fetch evaluation with export_format parameter
Backend->>Backend: Process format parameter
Backend->>Page: Return evaluation data in requested format
Page->>Component: Pass evaluation data with traces
alt Grouped Format Detected
Component->>Component: isGroupedFormat check
Component->>Component: Render GroupedResultsTable
Component->>Component: Calculate max answers, column widths
Component->>Component: Render multi-column table with grouped data
else Row Format
Component->>Component: normalizeToIndividualScores
Component->>Component: Render individual score rows
end
Component->>User: Display formatted results
User->>Page: Click export CSV
Page->>Page: handleExportCSV orchestrates
Page->>Page: Detect format and call appropriate exporter
Page->>User: Download CSV file
Estimated Code Review Effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
app/evaluations/[id]/page.tsx (1)
324-354: Row export is blocked for non‑V2 score formats.
`handleExportCSV` now returns an error unless the score object is V2, which prevents export for `NewScoreObject` even though `exportRowCSV()` supports it via normalization. This looks like a regression for existing data.
✅ Suggested fix
-    if (!isNewScoreObjectV2(scoreObject)) {
-      toast.error('Export not available for this score format');
-      return;
-    }
-
-    const traces = scoreObject.traces;
+    if (!isNewScoreObject(scoreObject) && !isNewScoreObjectV2(scoreObject)) {
+      toast.error('Export not available for this score format');
+      return;
+    }
+
+    if (!isNewScoreObjectV2(scoreObject)) {
+      exportRowCSV();
+      return;
+    }
+
+    const traces = scoreObject.traces;
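The order of checks in the suggested fix can be sketched in isolation. This is a minimal sketch, not the PR's code: the two interfaces are simplified stand-ins for the definitions in app/components/types.ts (the `individual_scores` field is an assumption), and `chooseExportPath` is a hypothetical helper that only reports which branch would run.

```typescript
// Assumed, simplified shapes; the real types live in app/components/types.ts.
interface NewScoreObject { summary_scores: unknown[]; individual_scores: unknown[] }
interface NewScoreObjectV2 { summary_scores: unknown[]; traces: unknown[] }

// True when `value` is an object whose `key` property is an array.
function hasArray(value: unknown, key: string): boolean {
  return typeof value === 'object' && value !== null &&
    Array.isArray((value as Record<string, unknown>)[key]);
}

function isNewScoreObject(score: unknown): score is NewScoreObject {
  return hasArray(score, 'summary_scores') && hasArray(score, 'individual_scores');
}

function isNewScoreObjectV2(score: unknown): score is NewScoreObjectV2 {
  return hasArray(score, 'summary_scores') && hasArray(score, 'traces');
}

type ExportPath = 'unsupported' | 'row-normalized' | 'v2';

// Hypothetical helper mirroring the suggested fix: reject unknown shapes,
// send non-V2 objects to the normalizing row exporter, and let V2 continue.
function chooseExportPath(score: unknown): ExportPath {
  if (!isNewScoreObject(score) && !isNewScoreObjectV2(score)) return 'unsupported';
  if (!isNewScoreObjectV2(score)) return 'row-normalized';
  return 'v2';
}
```

With this ordering, a `NewScoreObject` reaches the row exporter instead of hitting the error toast, which is the regression the comment describes.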
🤖 Fix all issues with AI agents
In `@app/api/evaluations/`[id]/route.ts:
- Around lines 54-61: Validate and whitelist the incoming export_format before
forwarding: read the raw value from request.nextUrl.searchParams, check it
against an explicit set of allowed values (e.g., 'row' and 'grouped', or whatever
your backend supports) and, if it isn't in the whitelist, default to 'row'; then
use that sanitized value when calling url.searchParams.set('export_format',
...). Update the logic around exportFormat, searchParams, and
url.searchParams.set to enforce this whitelist so arbitrary values are never
forwarded to backendUrl.
In `@app/components/types.ts`:
- Around line 59-61: normalizeToIndividualScores currently assumes
NewScoreObjectV2.traces are TraceItem and will emit nested trace_scores when
traces are grouped; update normalizeToIndividualScores to detect grouped format
using an isGroupedFormat guard (or use existing
isNewScoreObjectV2()/isLegacyScoreObject() runtime checks), and when traces are
GroupedTraceItem[] flatten them into individual TraceItem entries before
producing trace_scores (or skip/return safely if flattening isn't applicable) so
trace_scores remains a flat array.
🧹 Nitpick comments (2)
app/evaluations/[id]/page.tsx (1)
196-257: Build grouped CSV headers from all scores, not just the first answer.
Using `traces[0]?.scores[0]` can miss score columns that appear only in other answers/groups, leading to dropped data in exports. Consider building a union of score names across all groups/answers.
♻️ Suggested improvement
-    const scoreNames = traces[0]?.scores[0]?.map(s => s.name) || [];
+    const scoreNames = Array.from(
+      new Set(
+        traces.flatMap(group =>
+          (group.scores ?? []).flatMap(scores => (scores ?? []).map(s => s.name))
+        )
+      )
+    );
app/components/DetailedResultsTable.tsx (1)
304-345: Consider de‑duplicating `formatScoreValue` across row/grouped tables.
The grouped renderer re-implements score formatting; extracting a shared helper would prevent divergence and simplify future changes.
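To make the header nitpick concrete, here is a minimal sketch of a grouped CSV builder that takes the union of score names. The `Score` and `GroupedTraceItem` shapes are assumptions inferred from the review snippets, and `collectScoreNames`, `escapeCsv`, `groupedToCsv`, and the `answer_N` / `name_N` column naming are hypothetical, not the PR's exporter.

```typescript
// Assumed shapes inferred from the review snippets, not the PR's actual types.
interface Score { name: string; value: number | string | null }
interface GroupedTraceItem {
  question: string;
  ground_truth_answer: string;
  llm_answers: string[];
  scores: Score[][]; // scores[i] belongs to llm_answers[i]
}

// Union of score names across every group and answer, in first-seen order.
function collectScoreNames(traces: GroupedTraceItem[]): string[] {
  return Array.from(
    new Set(
      traces.flatMap(group =>
        (group.scores ?? []).flatMap(scores => (scores ?? []).map(s => s.name))
      )
    )
  );
}

// Quote a cell when it contains a comma, quote, or line break.
function escapeCsv(value: unknown): string {
  const text = value === null || value === undefined ? '' : String(value);
  return /[",\n\r]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// One row per question; each answer gets its own block of score columns.
function groupedToCsv(traces: GroupedTraceItem[]): string {
  const scoreNames = collectScoreNames(traces);
  const maxAnswers = Math.max(0, ...traces.map(t => t.llm_answers.length));
  const header: string[] = ['question', 'ground_truth'];
  for (let i = 1; i <= maxAnswers; i++) {
    header.push(`answer_${i}`, ...scoreNames.map(n => `${n}_${i}`));
  }
  const rows = traces.map(group => {
    const row: unknown[] = [group.question, group.ground_truth_answer];
    for (let i = 0; i < maxAnswers; i++) {
      const byName = new Map(
        (group.scores?.[i] ?? []).map(s => [s.name, s.value] as const)
      );
      // Missing answers or scores become empty cells rather than shifting columns.
      row.push(group.llm_answers[i] ?? '', ...scoreNames.map(n => byName.get(n) ?? ''));
    }
    return row;
  });
  return [header, ...rows].map(r => r.map(escapeCsv).join(',')).join('\n');
}
```

A score that appears only on the second answer still gets a column in every answer block, so no value is dropped when the first answer lacks it.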
const searchParams = request.nextUrl.searchParams;
const exportFormat = searchParams.get('export_format') || 'row';

// Build URL with query parameters
const url = new URL(`${backendUrl}/api/v1/evaluations/${id}`);
url.searchParams.set('get_trace_info', 'true');
url.searchParams.set('resync_score', 'false');
url.searchParams.set('export_format', exportFormat);
Whitelist export_format before forwarding.
Right now any query value gets passed through; invalid values can trigger backend errors or unexpected behavior. Consider validating against supported options and defaulting to row.
✅ Suggested fix
- const exportFormat = searchParams.get('export_format') || 'row';
+ const exportFormatParam = searchParams.get('export_format');
+ const exportFormat =
+ exportFormatParam === 'row' || exportFormatParam === 'grouped'
+ ? exportFormatParam
+ : 'row';
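The same whitelist can be kept in one place so the accepted values and the type stay in sync. This is a sketch under assumptions: `EXPORT_FORMATS`, `ExportFormat`, and `parseExportFormat` are hypothetical names, the allowed values are the two formats named in the PR summary, and the URLs are placeholders.

```typescript
// The two formats named in the PR summary; extend this list if the backend adds more.
const EXPORT_FORMATS = ['row', 'grouped'] as const;
type ExportFormat = (typeof EXPORT_FORMATS)[number];

// Hypothetical helper: return the raw value only if it is whitelisted, else 'row'.
function parseExportFormat(raw: string | null): ExportFormat {
  return (EXPORT_FORMATS as readonly string[]).includes(raw ?? '')
    ? (raw as ExportFormat)
    : 'row';
}

// Usage with the standard URL API (placeholder hosts, not the project's backendUrl).
const incoming = new URL('https://example.test/api/evaluations/42?export_format=xml');
const target = new URL('https://backend.example/api/v1/evaluations/42');
target.searchParams.set(
  'export_format',
  parseExportFormat(incoming.searchParams.get('export_format'))
);
```

Here the unsupported value `xml` never reaches the backend URL; the forwarded parameter falls back to `row`.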
 export interface NewScoreObjectV2 {
   summary_scores: SummaryScore[];
-  traces: TraceItem[];
+  traces: TraceItem[] | GroupedTraceItem[];
Normalize grouped traces before treating them as TraceItem.
NewScoreObjectV2.traces can now be grouped, but normalizeToIndividualScores still assumes TraceItem and will emit malformed trace_scores (nested arrays) when grouped traces flow through. Guard with isGroupedFormat and flatten or return safely.
💡 Suggested fix (flatten grouped traces)
if (isNewScoreObjectV2(score)) {
+ if (isGroupedFormat(score.traces)) {
+ return score.traces.flatMap(group =>
+ group.llm_answers.map((answer, idx) => ({
+ trace_id: group.trace_ids?.[idx] ?? '',
+ input: { question: group.question },
+ output: { answer },
+ metadata: { ground_truth: group.ground_truth_answer },
+ trace_scores: group.scores?.[idx] ?? []
+ }))
+ );
+ }
// Convert TraceItem[] to IndividualScore[]
return score.traces.map(trace => ({
trace_id: trace.trace_id,
input: { question: trace.question },
output: { answer: trace.llm_answer },
metadata: { ground_truth: trace.ground_truth_answer },
trace_scores: trace.scores
}));
}
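The review relies on an `isGroupedFormat` guard without showing it. A minimal sketch of such a guard follows, together with a flattening step that keeps each score list flat. The interfaces are assumptions inferred from the suggested fix, and `flattenGrouped` is a hypothetical helper; the PR's real guard may differ.

```typescript
// Assumed shapes inferred from the suggested fix, not the PR's definitions.
interface Score { name: string; value: number }
interface TraceItem {
  trace_id: string;
  question: string;
  llm_answer: string;
  ground_truth_answer: string;
  scores: Score[];
}
interface GroupedTraceItem {
  trace_ids: string[];
  question: string;
  llm_answers: string[];
  ground_truth_answer: string;
  scores: Score[][]; // scores[i] belongs to llm_answers[i]
}

// Runtime guard: grouped items carry an `llm_answers` array.
// An empty list has nothing to distinguish, so it is treated as row format.
function isGroupedFormat(
  traces: TraceItem[] | GroupedTraceItem[]
): traces is GroupedTraceItem[] {
  const first = traces[0];
  return first !== undefined && 'llm_answers' in first && Array.isArray(first.llm_answers);
}

// Hypothetical helper: expand each group into one TraceItem per answer.
function flattenGrouped(groups: GroupedTraceItem[]): TraceItem[] {
  return groups.flatMap(group =>
    group.llm_answers.map((answer, idx) => ({
      trace_id: group.trace_ids?.[idx] ?? '',
      question: group.question,
      llm_answer: answer,
      ground_truth_answer: group.ground_truth_answer,
      scores: group.scores?.[idx] ?? [],
    }))
  );
}
```

Calling `isGroupedFormat(score.traces)` before mapping lets the row path keep its `TraceItem` assumption, while grouped input is expanded first.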
Target Issue: #7
Summary:
- Add a grouped format option that displays multiple LLM answers per question side-by-side
- Add a format selector dropdown and CSV export for both formats: [row, grouped]; the default is row
- Support CSV export for the grouped format