Evaluation: Group Export by vprashrex · Pull Request #25 · ProjectTech4DevAI/kaapi-frontend

vprashrex · 2026-01-29T09:28:57Z

Target Issue: #7

Summary:

- Add a grouped format option that displays multiple LLM answers per question side by side
- Add a format selector dropdown and CSV export for both formats (row, grouped); row is the default
- Support CSV export for the grouped format

Summary by CodeRabbit

New Features
- Added grouped format display for evaluation results with a format selector to toggle between row and grouped layouts
- Extended CSV export functionality to support both row and grouped format exports



coderabbitai · 2026-01-29T09:29:12Z

Walkthrough

The changes introduce support for displaying and exporting evaluation results in a grouped format alongside the existing row format. This includes backend support for an export_format query parameter, a new grouped result table component, type definitions for grouped data structures, UI controls for format selection, and CSV export functions.

Changes

Cohort / File(s)	Summary
Backend API Support `app/api/evaluations/[id]/route.ts`	Added support for export_format query parameter; appends it to backend URL while preserving existing parameters like get_trace_info and resync_score.
Type Definitions `app/components/types.ts`	Introduced GroupedTraceItem interface for grouped trace structure, added isGroupedFormat type guard to detect grouped format, and updated NewScoreObjectV2.traces to accept union of TraceItem or GroupedTraceItem arrays.
UI Components `app/components/DetailedResultsTable.tsx`	Added new GroupedResultsTable internal component that renders a multi-column table for grouped traces with dynamic column generation, fixed column widths for horizontal scrolling, and reused color-coding and tooltip logic. Maintains backward compatibility with row format via pre-check.
Page Integration `app/evaluations/[id]/page.tsx`	Added exportFormat state and format selector UI control, introduced three CSV export functions (exportGroupedCSV, exportRowCSV, handleExportCSV) with format detection logic, and integrated export_format parameter into evaluation fetch requests.
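To make the type changes in the table above concrete, here is a minimal sketch of what the grouped shape and its guard could look like. The field names (`llm_answers`, `trace_ids`, `question`, `ground_truth_answer`, `scores`) come from the review comments on this PR; the `Score.value` field, the optionality of each field, and the detection rule inside the guard are assumptions, and the real definitions in `app/components/types.ts` may differ.

```typescript
// Hypothetical shapes inferred from the review comments on this PR.
interface Score {
  name: string;
  value: number | string | null; // assumed; only `name` is confirmed by the review
}

interface TraceItem {
  trace_id: string;
  question: string;
  llm_answer: string;
  ground_truth_answer: string;
  scores: Score[];
}

interface GroupedTraceItem {
  question: string;
  ground_truth_answer: string;
  llm_answers: string[];
  trace_ids: string[];
  scores: Score[][]; // one Score[] per answer
}

// Type guard: treats the array as grouped when its first element carries an
// `llm_answers` array. An empty array is reported as not grouped (assumption).
function isGroupedFormat(
  traces: TraceItem[] | GroupedTraceItem[]
): traces is GroupedTraceItem[] {
  const first = traces[0];
  return (
    first !== undefined &&
    "llm_answers" in first &&
    Array.isArray(first.llm_answers)
  );
}
```

Checking only the first element keeps the guard cheap, at the cost of trusting the backend to return a homogeneous array.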

Sequence Diagram

sequenceDiagram
    participant User
    participant Page as Evaluation Page
    participant Backend as Backend API
    participant Component as DetailedResultsTable

    User->>Page: Select export format (Grouped/Row)
    Page->>Page: Update exportFormat state
    Page->>Backend: Fetch evaluation with export_format parameter
    Backend->>Backend: Process format parameter
    Backend->>Page: Return evaluation data in requested format
    Page->>Component: Pass evaluation data with traces
    
    alt Grouped Format Detected
        Component->>Component: isGroupedFormat check
        Component->>Component: Render GroupedResultsTable
        Component->>Component: Calculate max answers, column widths
        Component->>Component: Render multi-column table with grouped data
    else Row Format
        Component->>Component: normalizeToIndividualScores
        Component->>Component: Render individual score rows
    end
    
    Component->>User: Display formatted results
    User->>Page: Click export CSV
    Page->>Page: handleExportCSV orchestrates
    Page->>Page: Detect format and call appropriate exporter
    Page->>User: Download CSV file

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A grouped format hops in today,
Rows and answers side-by-side to play,
Type guards dance, components align,
Export formats arrange just fine,
With colors and scrolls, the view's divine! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The title 'Evaluation: Group Export' is vague and does not clearly convey the main changes. While it mentions evaluation and grouping/export, it lacks specificity about what features were added or what the grouped format entails.	Consider using a more descriptive title that captures the core feature, such as 'Add grouped format and CSV export options for evaluations' or 'Support grouped results display and dual-format CSV export'.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.


coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

app/evaluations/[id]/page.tsx (1)

324-354: Row export is blocked for non‑V2 score formats.

handleExportCSV now returns an error unless the score object is V2, which prevents export for NewScoreObject even though exportRowCSV() supports it via normalization. This looks like a regression for existing data.

✅ Suggested fix

-      if (!isNewScoreObjectV2(scoreObject)) {
-        toast.error('Export not available for this score format');
-        return;
-      }
-
-      const traces = scoreObject.traces;
+      if (!isNewScoreObject(scoreObject) && !isNewScoreObjectV2(scoreObject)) {
+        toast.error('Export not available for this score format');
+        return;
+      }
+
+      if (!isNewScoreObjectV2(scoreObject)) {
+        exportRowCSV();
+        return;
+      }
+
+      const traces = scoreObject.traces;
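The branching in the suggested fix can be sketched as a pure function that decides which exporter to run, which makes the three outcomes easy to test without the UI. This is an illustration only: `planExport`, `ExportPlan`, the `*Like` shapes, and the `individual_scores` field name are hypothetical, and the guards here are simplified stand-ins for the real `isNewScoreObject` and `isNewScoreObjectV2`.

```typescript
// Hypothetical minimal shapes; the real types live in app/components/types.ts.
interface NewScoreObjectLike {
  individual_scores: unknown[]; // assumed field name
}
interface NewScoreObjectV2Like {
  traces: unknown[];
}

type ExportPlan = "unsupported" | "row" | "grouped";

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null;
}

// Simplified stand-ins for the project's guards.
function isNewScoreObjectV2(v: unknown): v is NewScoreObjectV2Like {
  return isRecord(v) && Array.isArray(v.traces);
}
function isNewScoreObject(v: unknown): v is NewScoreObjectLike {
  return isRecord(v) && Array.isArray(v.individual_scores);
}

// Mirrors the suggested fix: reject unknown formats, send non-V2 objects to
// the row exporter, and only inspect traces for V2 objects.
function planExport(scoreObject: unknown): ExportPlan {
  if (!isNewScoreObject(scoreObject) && !isNewScoreObjectV2(scoreObject)) {
    return "unsupported"; // the page would call toast.error(...) here
  }
  if (!isNewScoreObjectV2(scoreObject)) {
    return "row"; // the page would call exportRowCSV() here
  }
  const first = scoreObject.traces[0];
  return isRecord(first) && Array.isArray(first.llm_answers) ? "grouped" : "row";
}
```

In the page component, `handleExportCSV` could switch on the returned plan and call `exportRowCSV` or `exportGroupedCSV` accordingly.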

🤖 Fix all issues with AI agents

In `@app/api/evaluations/[id]/route.ts`:
- Around line 54-61: Validate and whitelist the incoming export_format before
forwarding: read the raw value from request.nextUrl.searchParams, check it
against an explicit set of allowed values (e.g., 'row', 'json', 'csv' — whatever
your backend supports) and if it isn't in the whitelist default to 'row', then
use that sanitized value when calling url.searchParams.set('export_format',
...). Update the logic around exportFormat, searchParams, and
url.searchParams.set to enforce this whitelist so arbitrary values are never
forwarded to backendUrl.

In `@app/components/types.ts`:
- Around line 59-61: normalizeToIndividualScores currently assumes
NewScoreObjectV2.traces are TraceItem and will emit nested trace_scores when
traces are grouped; update normalizeToIndividualScores to detect grouped format
using an isGroupedFormat guard (or use existing
isNewScoreObjectV2()/isLegacyScoreObject() runtime checks), and when traces are
GroupedTraceItem[] flatten them into individual TraceItem entries before
producing trace_scores (or skip/return safely if flattening isn't applicable) so
trace_scores remains a flat array.

🧹 Nitpick comments (2)

app/evaluations/[id]/page.tsx (1)

196-257: Build grouped CSV headers from all scores, not just the first answer.

Using traces[0]?.scores[0] can miss score columns that appear only in other answers/groups, leading to dropped data in exports. Consider building a union of score names across all groups/answers.

♻️ Suggested improvement

-      const scoreNames = traces[0]?.scores[0]?.map(s => s.name) || [];
+      const scoreNames = Array.from(
+        new Set(
+          traces.flatMap(group =>
+            (group.scores ?? []).flatMap(scores => (scores ?? []).map(s => s.name))
+          )
+        )
+      );
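
A standalone sketch of the same idea, written with plain loops so the behavior is easy to check in isolation. `collectScoreNames` is a hypothetical helper name, and the `Score.value` field is an assumption; only `name` and the nesting of `scores` are taken from the diff above.

```typescript
interface Score {
  name: string;
  value: number | string | null; // assumed; only `name` appears in the diff
}

interface GroupedTraceItem {
  question: string;
  ground_truth_answer: string;
  llm_answers: string[];
  scores: Score[][]; // one Score[] per answer
}

// Collects every distinct score name across all groups and answers,
// preserving first-seen order so CSV columns stay stable.
function collectScoreNames(traces: GroupedTraceItem[]): string[] {
  const names: string[] = [];
  for (const group of traces) {
    for (const answerScores of group.scores) {
      for (const score of answerScores) {
        if (names.indexOf(score.name) === -1) {
          names.push(score.name);
        }
      }
    }
  }
  return names;
}

// Example data where the first answer lacks a score that later answers have.
const exampleTraces: GroupedTraceItem[] = [
  {
    question: "Q1",
    ground_truth_answer: "G1",
    llm_answers: ["A1", "A2"],
    scores: [
      [{ name: "accuracy", value: 1 }],
      [{ name: "accuracy", value: 0 }, { name: "relevance", value: 0.5 }],
    ],
  },
  {
    question: "Q2",
    ground_truth_answer: "G2",
    llm_answers: ["A3"],
    scores: [[{ name: "fluency", value: "high" }]],
  },
];

const exampleScoreNames = collectScoreNames(exampleTraces);
```

With this data, reading names only from `traces[0].scores[0]` would yield just `accuracy`, while the union also picks up `relevance` and `fluency`.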

app/components/DetailedResultsTable.tsx (1)

304-345: Consider de‑duplicating formatScoreValue across row/grouped tables.

The grouped renderer re-implements score formatting; extracting a shared helper would prevent divergence and simplify future changes.

coderabbitai · 2026-01-29T09:36:55Z

app/api/evaluations/[id]/route.ts

+    const searchParams = request.nextUrl.searchParams;
+    const exportFormat = searchParams.get('export_format') || 'row';
+
+    // Build URL with query parameters
    const url = new URL(`${backendUrl}/api/v1/evaluations/${id}`);
    url.searchParams.set('get_trace_info', 'true');
    url.searchParams.set('resync_score', 'false');
+    url.searchParams.set('export_format', exportFormat);


⚠️ Potential issue | 🟡 Minor

Whitelist export_format before forwarding.

Right now any query value gets passed through; invalid values can trigger backend errors or unexpected behavior. Consider validating against supported options and defaulting to row.

✅ Suggested fix

-    const exportFormat = searchParams.get('export_format') || 'row';
+    const exportFormatParam = searchParams.get('export_format');
+    const exportFormat =
+      exportFormatParam === 'row' || exportFormatParam === 'grouped'
+        ? exportFormatParam
+        : 'row';
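
The same whitelist can be pulled into a small helper so the allowed values live in one place. This is a sketch under assumptions: `sanitizeExportFormat` and `buildBackendUrl` are hypothetical names, the allowed set (`row`, `grouped`) is taken from the PR summary rather than from the backend's documentation, and `http://backend.test` in the usage note is a placeholder host.

```typescript
// Allowed values per the PR summary; the backend may support others.
const ALLOWED_EXPORT_FORMATS = ["row", "grouped"] as const;
type ExportFormat = (typeof ALLOWED_EXPORT_FORMATS)[number];

// Returns the raw value only when it is whitelisted; otherwise falls back to 'row'.
function sanitizeExportFormat(raw: string | null): ExportFormat {
  for (const allowed of ALLOWED_EXPORT_FORMATS) {
    if (raw === allowed) return allowed;
  }
  return "row";
}

// Builds the backend URL the way the route handler does, but forwards only
// the sanitized export_format.
function buildBackendUrl(
  backendUrl: string,
  id: string,
  incoming: URLSearchParams
): URL {
  const url = new URL(`${backendUrl}/api/v1/evaluations/${id}`);
  url.searchParams.set("get_trace_info", "true");
  url.searchParams.set("resync_score", "false");
  url.searchParams.set(
    "export_format",
    sanitizeExportFormat(incoming.get("export_format"))
  );
  return url;
}
```

For example, `buildBackendUrl("http://backend.test", "42", new URLSearchParams("export_format=xml"))` forwards `export_format=row`, since `xml` is not in the whitelist.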


coderabbitai · 2026-01-29T09:36:55Z

app/components/types.ts

 export interface NewScoreObjectV2 {
  summary_scores: SummaryScore[];
-  traces: TraceItem[];
+  traces: TraceItem[] | GroupedTraceItem[];


⚠️ Potential issue | 🟠 Major

Normalize grouped traces before treating them as TraceItem.

NewScoreObjectV2.traces can now be grouped, but normalizeToIndividualScores still assumes TraceItem and will emit malformed trace_scores (nested arrays) when grouped traces flow through. Guard with isGroupedFormat and flatten or return safely.

💡 Suggested fix (flatten grouped traces)

 if (isNewScoreObjectV2(score)) {
+  if (isGroupedFormat(score.traces)) {
+    return score.traces.flatMap(group =>
+      group.llm_answers.map((answer, idx) => ({
+        trace_id: group.trace_ids?.[idx] ?? '',
+        input: { question: group.question },
+        output: { answer },
+        metadata: { ground_truth: group.ground_truth_answer },
+        trace_scores: group.scores?.[idx] ?? []
+      }))
+    );
+  }
   // Convert TraceItem[] to IndividualScore[]
   return score.traces.map(trace => ({
     trace_id: trace.trace_id,
     input: { question: trace.question },
     output: { answer: trace.llm_answer },
     metadata: { ground_truth: trace.ground_truth_answer },
     trace_scores: trace.scores
   }));
 }

Based on learnings: Use type guards like `isNewScoreObjectV2()` and `isLegacyScoreObject()` for runtime type checking when working with union types.
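
To see the flattening in isolation, here is a self-contained sketch that turns grouped traces into one flat entry per answer. `flattenGroupedTraces` is a hypothetical helper name; `trace_ids` and `scores` are marked optional only to mirror the optional chaining in the suggested fix, and `Score.value` is an assumption.

```typescript
interface Score {
  name: string;
  value: number | string | null; // assumed; only `name` is confirmed by the review
}

interface GroupedTraceItem {
  question: string;
  ground_truth_answer: string;
  llm_answers: string[];
  trace_ids?: string[]; // optional here to mirror the suggested fix
  scores?: Score[][]; // optional here to mirror the suggested fix
}

interface IndividualScore {
  trace_id: string;
  input: { question: string };
  output: { answer: string };
  metadata: { ground_truth: string };
  trace_scores: Score[];
}

// Emits one IndividualScore per LLM answer so that trace_scores stays a flat
// Score[] instead of the nested Score[][] carried by a group.
function flattenGroupedTraces(groups: GroupedTraceItem[]): IndividualScore[] {
  const flat: IndividualScore[] = [];
  for (const group of groups) {
    group.llm_answers.forEach((answer, idx) => {
      flat.push({
        trace_id: group.trace_ids?.[idx] ?? "",
        input: { question: group.question },
        output: { answer },
        metadata: { ground_truth: group.ground_truth_answer },
        trace_scores: group.scores?.[idx] ?? [],
      });
    });
  }
  return flat;
}
```

A group with two answers yields two entries that share the same question and ground truth; answers without a matching `trace_ids` or `scores` entry fall back to an empty string and an empty array.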


vprashrex self-assigned this Jan 29, 2026

vprashrex added the enhancement New feature or request label Jan 29, 2026

vprashrex linked an issue Jan 29, 2026 that may be closed by this pull request

Evaluation UI: Export Results #7

Closed

coderabbitai bot reviewed Jan 29, 2026

Prajna1999 approved these changes Jan 29, 2026

AkhileshNegi changed the title ~~feat: add support for grouped evaluation results and CSV export format selection~~ Evaluation: Group Export Jan 29, 2026

AkhileshNegi approved these changes Jan 29, 2026

AkhileshNegi merged commit 77ceacf into main Jan 29, 2026
1 check passed

coderabbitai bot mentioned this pull request Jan 30, 2026

fix: show toast notification when grouped export format is unavailable #29

Merged

This was referenced Mar 9, 2026

Evals UI: fixing score and config display #56

Merged

Guardrails UI: include in config editor #61

Closed

evals UI: Include comments in grouped question CSV #68

Merged

coderabbitai bot mentioned this pull request Mar 16, 2026

UI overhaul #69

Merged

Ayush8923 deleted the feat/evaluation-grouped branch March 20, 2026 11:36

AkhileshNegi merged 1 commit into main from feat/evaluation-grouped

3 participants
