UI overhaul by AkhileshNegi · Pull Request #69 · ProjectTech4DevAI/kaapi-frontend

AkhileshNegi · 2026-03-16T07:29:45Z

Summary

New Features

Text-to-Speech (TTS) Evaluation — fully new screen

Create datasets with text samples for TTS evaluation
Run TTS evaluations and view results
Listen to generated audio with a built-in audio player (play/pause, progress bar, file size info)
Leave feedback and comments on individual TTS results

Dataset descriptions — you can now add a description when creating a dataset (Text and TTS)
View & edit dataset samples (Speech-to-Text) — you can now open a dataset to see its individual samples, and update the ground
truth text or language for each sample inline
CSV validation on upload — when uploading a CSV for text evaluation, the system now checks upfront that the file has exactly the
required "question" and "answer" columns, and shows a clear error if not
Dataset creation moved inline — instead of a separate popup modal, creating a dataset is now done directly on the left panel of the
Datasets tab, making the flow smoother
Missing API key warning — if no API key is configured, the Speech-to-Text page now shows a helpful message with a link to the
Keystore page, instead of silently failing

Visual / UI Improvements

Consistent color system across all screens — all pages and components now use the same centralized color palette instead of
hardcoded values, giving a uniform look and feel
Cleaner tab navigation — tabs across all screens (Text Eval, STT, TTS) now use the same shared component with consistent styling,
smaller padding, and smoother transitions
Improved results table layout — the detailed results table now groups each question into two rows: one for the text (question,
ground truth, answer) and one for the scores below, making it easier to read
Compact configuration selector — the config dropdown is now more space-efficient with a simpler single-line display
Long dataset descriptions collapse — descriptions over 100 characters are automatically truncated with a "Show more" toggle
Thinner progress bars — evaluation progress indicators are now slimmer and less distracting
Consistent "Run Evaluation" button text — all screens now use the same label for starting an evaluation
Text Evaluation comment tooltips fixed — score comments in the results table used to appear on hover, which was unreliable (could get cut off or positioned offscreen). They now open on click with proper positioning

Summary by CodeRabbit

New Features
- Server proxy endpoints for dataset operations (GET/DELETE/PATCH) with API-key validation and optional CSV content fetching.
- Text-to-Speech page expanded with dataset/evaluation management, dataset viewing/download, and result editing.
- Speech-to-Text page enhanced with a two-pane datasets/evaluations workflow, per-sample editing, and richer run cards.
Refactor
- Streamlined configuration modal and compact config selector.
- Reworked results tables with fixed-position comment tooltips and improved export/CSV flows.
- Score display converted to chip-style.
UX
- Sidebar reorganized with Evaluations, Documents, and Knowledge Base entries.

coderabbitai

Actionable comments posted: 3

♻️ Duplicate comments (8)

app/text-to-speech/page.tsx (3)

815-837: ⚠️ Potential issue | 🟠 Major

CSV parsing still breaks quoted multiline fields.

Line 816 splits on \n before quote-aware parsing, which corrupts CSV rows containing embedded newlines.

💡 Suggested fix

-      // Parse CSV
-      const lines = csvText.split('\n').filter((l: string) => l.trim());
-      const parseRow = (line: string): string[] => {
-        const result: string[] = [];
-        let current = '';
-        let inQuotes = false;
-        for (let i = 0; i < line.length; i++) {
-          if (line[i] === '"') {
-            if (inQuotes && line[i + 1] === '"') { current += '"'; i++; }
-            else { inQuotes = !inQuotes; }
-          } else if (line[i] === ',' && !inQuotes) {
-            result.push(current.trim());
-            current = '';
-          } else {
-            current += line[i];
-          }
-        }
-        result.push(current.trim());
-        return result;
-      };
-
-      const headers = lines.length > 0 ? parseRow(lines[0]) : [];
-      const rows = lines.slice(1).map(parseRow);
+      // Parse CSV (quote-aware, supports multiline fields)
+      const parseCsv = (text: string): string[][] => {
+        const parsedRows: string[][] = [];
+        let row: string[] = [];
+        let cell = '';
+        let inQuotes = false;
+
+        for (let i = 0; i < text.length; i++) {
+          const ch = text[i];
+          const next = text[i + 1];
+
+          if (ch === '"') {
+            if (inQuotes && next === '"') {
+              cell += '"';
+              i++;
+            } else {
+              inQuotes = !inQuotes;
+            }
+            continue;
+          }
+
+          if (ch === ',' && !inQuotes) {
+            row.push(cell.trim());
+            cell = '';
+            continue;
+          }
+
+          if ((ch === '\n' || ch === '\r') && !inQuotes) {
+            if (ch === '\r' && next === '\n') i++;
+            row.push(cell.trim());
+            if (row.some(col => col.length > 0)) parsedRows.push(row);
+            row = [];
+            cell = '';
+            continue;
+          }
+
+          cell += ch;
+        }
+
+        row.push(cell.trim());
+        if (row.some(col => col.length > 0)) parsedRows.push(row);
+        return parsedRows;
+      };
+
+      const parsed = parseCsv(csvText);
+      const headers = parsed[0] ?? [];
+      const rows = parsed.slice(1);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/text-to-speech/page.tsx` around lines 815 - 837, The current logic splits
csvText by '\n' before parsing which breaks quoted multiline fields; instead,
implement a quote-aware line splitter or use a proper CSV parser so parseRow
receives complete logical records (not pre-split physical lines). Modify the
code around csvText.split('\n') and parseRow: either replace the split with a
function that iterates through csvText character-by-character, tracking inQuotes
and accumulating full records (pushing a record only when a newline is
encountered while inQuotes is false), or swap in a CSV library that returns
headers and rows directly; ensure parseRow is passed complete records and keep
its quote-handling for embedded commas/back-to-back quotes.

314-318: ⚠️ Potential issue | 🟠 Major

Preserve the user-selected language across reloads.

Line 317 always resets datasetLanguageId to the first language, so tab switches/reloads can silently change the user’s selection.

💡 Suggested fix

       if (languagesList.length > 0) {
         setLanguages(languagesList);
-        if (languagesList[0]?.id) {
-          setDatasetLanguageId(languagesList[0].id);
-        }
+        setDatasetLanguageId(prev =>
+          languagesList.some(language => language.id === prev)
+            ? prev
+            : (languagesList[0]?.id ?? prev)
+        );
       } else {
         setLanguages(DEFAULT_LANGUAGES);
+        setDatasetLanguageId(prev =>
+          DEFAULT_LANGUAGES.some(language => language.id === prev)
+            ? prev
+            : (DEFAULT_LANGUAGES[0]?.id ?? prev)
+        );
       }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/text-to-speech/page.tsx` around lines 314 - 318, When populating
languages from languagesList, avoid unconditionally overwriting the user’s
current selection: call setLanguages(languagesList) as before but only call
setDatasetLanguageId(languagesList[0].id) when there is no existing
datasetLanguageId (or no saved selection). In other words, change the logic
around setDatasetLanguageId to check the current datasetLanguageId state (or a
persisted value in localStorage) and only default to languagesList[0].id if none
exists; update the component mount/initialization path to restore a saved
datasetLanguageId if present.

1229-1256: ⚠️ Potential issue | 🟠 Major

Guard against out-of-order feedback PATCH responses.

updateFeedback can apply stale responses after newer edits (same resultId), causing state rollback.

💡 Suggested fix

 function EvaluationsTab({
@@
 }: EvaluationsTabProps) {
   const [playingResultId, setPlayingResultId] = useState<number | null>(null);
   const [openScoreInfo, setOpenScoreInfo] = useState<string | null>(null);
   const [scoreInfoPos, setScoreInfoPos] = useState<{ top: number; left: number }>({ top: 0, left: 0 });
+  const feedbackVersionRef = useRef<Record<number, number>>({});
@@
   const updateFeedback = async (resultId: number, isCorrect: boolean | null, comment?: string, score?: { [key: string]: any }) => {
     if (apiKeys.length === 0) return;
+    const version = (feedbackVersionRef.current[resultId] ?? 0) + 1;
+    feedbackVersionRef.current[resultId] = version;
@@
       if (!response.ok) throw new Error('Failed to update feedback');
 
       setResults(prev => prev.map(r =>
-        r.id === resultId ? {
+        r.id === resultId && feedbackVersionRef.current[resultId] === version ? {
           ...r,
           ...(isCorrect !== undefined ? { is_correct: isCorrect } : {}),
           ...(comment !== undefined && { comment }),
           ...(score !== undefined && { score: { ...(r.score || {}), ...score } }),
         } : r
       ));

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/text-to-speech/page.tsx` around lines 1229 - 1256, The PATCH response
handler can apply stale updates if earlier requests resolve after later ones;
fix by tracking a per-result request sequence (e.g., a useRef map like
feedbackSeqRef.current[resultId]) that you increment just before calling fetch
in updateFeedback, capture that sequence value in the request closure, and when
the response arrives only call setResults/update state if the captured sequence
equals the current feedbackSeqRef.current[resultId]; reference updateFeedback,
setResults and resultId when adding this request-sequencing guard so
out-of-order responses are ignored.

app/evaluations/page.tsx (4)

123-130: ⚠️ Potential issue | 🟠 Major

Strip BOM before CSV header validation.

Line 123 can treat the first header as \uFEFFquestion, which rejects valid CSV exports from spreadsheet tools.

Proposed fix

-      const firstLine = text.split('\n')[0]?.trim();
+      const firstLine = text.split('\n')[0]?.replace(/^\uFEFF/, '').trim();

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/page.tsx` around lines 123 - 130, The CSV header check fails
when the first header has a leading BOM (e.g. \uFEFFquestion); update the logic
around firstLine and headers in this handler to strip any BOM before
trimming/splitting (remove leading \uFEFF from firstLine and from the first
header token if present) so required = ['question','answer'] validation
succeeds; adjust where firstLine is computed and where headers is derived (the
variables named firstLine and headers in this file) to normalize/remove BOM
before running the required header check.

1018-1020: ⚠️ Potential issue | 🟠 Major

Gate Run Evaluation on existing dataset object, not stale id string.

Line 1019 allows canRun when selectedDatasetId exists even if that dataset was deleted.

Proposed fix

-  const canRun = experimentName.trim() && selectedDatasetId && selectedConfigId && selectedConfigVersion && !isEvaluating;
+  const canRun = !!(
+    experimentName.trim() &&
+    selectedDataset &&
+    selectedConfigId &&
+    selectedConfigVersion &&
+    !isEvaluating
+  );

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/page.tsx` around lines 1018 - 1020, The canRun predicate
currently checks only that selectedDatasetId is truthy, which allows running
when the id no longer maps to an existing dataset; update the logic to require
the resolved selectedDataset object (the result of storedDatasets.find) to be
non-undefined instead of just selectedDatasetId, e.g., compute selectedDataset
from storedDatasets and change the canRun expression (used where canRun is
defined) to use selectedDataset (and the other existing checks like
experimentName.trim(), selectedConfigId, selectedConfigVersion, and
!isEvaluating) so runs are gated on an actual existing dataset object.

508-530: ⚠️ Potential issue | 🟠 Major

CSV modal parser still breaks quoted multiline cells.

Lines 509-530 split by \n before parsing, so valid quoted fields containing newlines get corrupted in view/download.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/page.tsx` around lines 508 - 530, The CSV parser currently
splits the file by '\n' into lines (variable lines) before calling parseRow,
which corrupts quoted fields that contain embedded newlines; update the logic in
this block (remove the pre-splitting by '\n' and/or replace it) so the entire
csvText is parsed as a single stream and row boundaries are determined by an
algorithm that respects quotes (or simply swap in a robust CSV parser library).
Specifically, revise the usage around parseRow, headers and rows so parseRow
operates on a full-stream parser that handles quoted multiline fields (or call a
library such as PapaParse/CSV-Parse) and then extract headers and rows from that
proper parse result instead of using csvText.split('\n').

259-264: ⚠️ Potential issue | 🟠 Major

Refresh evaluation runs immediately after successful run creation.

After Line 259 success, the new run is not fetched, so users don’t see it until manual refresh.

Proposed fix

-interface EvaluationsTabProps {
+interface EvaluationsTabProps {
   ...
-  handleRunEvaluation: () => void;
+  handleRunEvaluation: () => Promise<void>;
   ...
 }

 function EvaluationsTab({
   ...
 }: EvaluationsTabProps) {
+  const runAndRefresh = async () => {
+    await handleRunEvaluation();
+    await fetchEvaluations();
+  };
   ...
   <button
-    onClick={handleRunEvaluation}
+    onClick={runAndRefresh}
     disabled={!canRun}

Also applies to: 1205-1207

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/page.tsx` around lines 259 - 264, After creating the
evaluation and before setting isEvaluating=false/toast, trigger the code path
that refreshes the runs list so the newly-created run appears immediately: call
the existing run-refresh function (e.g., fetchEvaluationRuns(),
loadEvaluationRuns(), or the SWR/mutate/revalidate function you use) right after
you parse the response (where variables data and evalId are available) and
before or immediately after setIsEvaluating(false); ensure you use the exact
function name your page uses to fetch runs so both this occurrence and the
similar block at the other location (lines referenced in the comment) call it to
re-fetch the evaluation runs.

app/speech-to-text/page.tsx (1)

1048-1050: ⚠️ Potential issue | 🟠 Major

Dataset viewer still truncates to the first 100 samples.

Line [1049] hardcodes sample_limit=100&sample_offset=0, so large datasets are still only partially viewable/editable while the modal looks complete. Add pagination or an explicit “showing first N” + load-more flow.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/speech-to-text/page.tsx` around lines 1048 - 1050, The dataset fetch
currently hardcodes `sample_limit=100&sample_offset=0` in the GET request (see
the fetch call using `datasetId` and `apiKeys[0].key`), which truncates large
datasets; update the dataset viewer to support pagination or a load-more flow by
replacing the hardcoded params with state-driven `sample_limit` and
`sample_offset`/`page` values, add a UI indicator like “showing first N” and a
Load More button that increments offset and appends new samples to the existing
list, and ensure the fetch logic (where the fetch URL is built) and any modal
rendering consume and display the full paginated result instead of only the
first 100.

🧹 Nitpick comments (2)

app/speech-to-text/page.tsx (2)

1522-1540: Avoid eagerly mounting an audio player for every sample row.

Rendering AudioPlayerFromUrl for each row can trigger many metadata fetches at modal open (especially with large sample lists), causing avoidable UI/network overhead.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/speech-to-text/page.tsx` around lines 1522 - 1540, The current map over
viewModalData.samples eagerly mounts AudioPlayerFromUrl for every sample,
causing wasted metadata fetches; change the rendering logic in the table row
(the map callback that references viewPlayingId, setViewPlayingId and
AudioPlayerFromUrl) so that AudioPlayerFromUrl is only mounted for the active
sample (e.g., render a lightweight Play button/placeholder for each row and only
render AudioPlayerFromUrl when viewPlayingId === sample.id), keep using
isPlaying={viewPlayingId === sample.id} and toggle viewPlayingId in the button's
onClick to mount/unmount the component.

1429-1429: Prefer palette tokens over new hardcoded hex colors.

Line [1429] and Line [2297] use #DCCFC3, which bypasses the centralized color system introduced in this PR.

Also applies to: 2295-2298

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/speech-to-text/page.tsx` at line 1429, Replace the hardcoded hex
'#DCCFC3' used in the inline style (the JSX element that already references
colors.bg.primary and boxShadow) with the appropriate centralized palette token
from the shared colors object (e.g., a border/accent token such as
colors.border.warning or the matching token in your palette); update both
occurrences in this file so the inline style uses colors.<token> instead of
'#DCCFC3' to keep color usage consistent with the project's design system.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/evaluations/page.tsx`:
- Around line 85-91: In loadStoredDatasets and the other dataset fetch/create
flows, you're currently using apiKeys[0].key which can pick the wrong tenant;
instead resolve the selected key via selectedKeyId (e.g., find in apiKeys by id)
and use that key.value/.key for the 'X-API-KEY' header everywhere (replace
occurrences of apiKeys[0].key). Update functions loadStoredDatasets and the
dataset create/fetch handlers to consistently derive the active key from
selectedKeyId and fail early with a clear error/log if the selected key is
missing.
- Around line 279-287: The menu-toggle button (the icon-only button that calls
setSidebarCollapsed(!sidebarCollapsed)) and all other icon-only controls (e.g.,
remove file, close modal, refresh runs) need accessible names so screen readers
announce them; add an appropriate aria-label or aria-labelledby to each
icon-only <button> (for this one use a label like "Toggle sidebar" tied to the
button) and ensure the label is descriptive and localized where applicable,
keeping the existing onClick/props intact; apply the same fix to the other
icon-only buttons referenced (remove file, close modal, refresh runs).

In `@app/speech-to-text/page.tsx`:
- Around line 1165-1174: Replace the clickable <span> controls used for the
language-info popover with semantic buttons: change the element in the handler
that calls setLanguageInfoPos and toggles setShowLanguageInfo (and the second
instance at the other location) to <button type="button">, keep the existing
className and inline style, preserve the onClick logic (including
e.stopPropagation()/e.preventDefault() and getBoundingClientRect usage), and add
a descriptive aria-label (e.g., "Show language information") so screen readers
announce it; this makes the control keyboard-accessible (Enter/Space) without
altering behavior in setLanguageInfoPos, setShowLanguageInfo or
showLanguageInfo.

---

Duplicate comments:
In `@app/evaluations/page.tsx`:
- Around line 123-130: The CSV header check fails when the first header has a
leading BOM (e.g. \uFEFFquestion); update the logic around firstLine and headers
in this handler to strip any BOM before trimming/splitting (remove leading
\uFEFF from firstLine and from the first header token if present) so required =
['question','answer'] validation succeeds; adjust where firstLine is computed
and where headers is derived (the variables named firstLine and headers in this
file) to normalize/remove BOM before running the required header check.
- Around line 1018-1020: The canRun predicate currently checks only that
selectedDatasetId is truthy, which allows running when the id no longer maps to
an existing dataset; update the logic to require the resolved selectedDataset
object (the result of storedDatasets.find) to be non-undefined instead of just
selectedDatasetId, e.g., compute selectedDataset from storedDatasets and change
the canRun expression (used where canRun is defined) to use selectedDataset (and
the other existing checks like experimentName.trim(), selectedConfigId,
selectedConfigVersion, and !isEvaluating) so runs are gated on an actual
existing dataset object.
- Around line 508-530: The CSV parser currently splits the file by '\n' into
lines (variable lines) before calling parseRow, which corrupts quoted fields
that contain embedded newlines; update the logic in this block (remove the
pre-splitting by '\n' and/or replace it) so the entire csvText is parsed as a
single stream and row boundaries are determined by an algorithm that respects
quotes (or simply swap in a robust CSV parser library). Specifically, revise the
usage around parseRow, headers and rows so parseRow operates on a full-stream
parser that handles quoted multiline fields (or call a library such as
PapaParse/CSV-Parse) and then extract headers and rows from that proper parse
result instead of using csvText.split('\n').
- Around line 259-264: After creating the evaluation and before setting
isEvaluating=false/toast, trigger the code path that refreshes the runs list so
the newly-created run appears immediately: call the existing run-refresh
function (e.g., fetchEvaluationRuns(), loadEvaluationRuns(), or the
SWR/mutate/revalidate function you use) right after you parse the response
(where variables data and evalId are available) and before or immediately after
setIsEvaluating(false); ensure you use the exact function name your page uses to
fetch runs so both this occurrence and the similar block at the other location
(lines referenced in the comment) call it to re-fetch the evaluation runs.

In `@app/speech-to-text/page.tsx`:
- Around line 1048-1050: The dataset fetch currently hardcodes
`sample_limit=100&sample_offset=0` in the GET request (see the fetch call using
`datasetId` and `apiKeys[0].key`), which truncates large datasets; update the
dataset viewer to support pagination or a load-more flow by replacing the
hardcoded params with state-driven `sample_limit` and `sample_offset`/`page`
values, add a UI indicator like “showing first N” and a Load More button that
increments offset and appends new samples to the existing list, and ensure the
fetch logic (where the fetch URL is built) and any modal rendering consume and
display the full paginated result instead of only the first 100.

In `@app/text-to-speech/page.tsx`:
- Around line 815-837: The current logic splits csvText by '\n' before parsing
which breaks quoted multiline fields; instead, implement a quote-aware line
splitter or use a proper CSV parser so parseRow receives complete logical
records (not pre-split physical lines). Modify the code around
csvText.split('\n') and parseRow: either replace the split with a function that
iterates through csvText character-by-character, tracking inQuotes and
accumulating full records (pushing a record only when a newline is encountered
while inQuotes is false), or swap in a CSV library that returns headers and rows
directly; ensure parseRow is passed complete records and keep its quote-handling
for embedded commas/back-to-back quotes.
- Around line 314-318: When populating languages from languagesList, avoid
unconditionally overwriting the user’s current selection: call
setLanguages(languagesList) as before but only call
setDatasetLanguageId(languagesList[0].id) when there is no existing
datasetLanguageId (or no saved selection). In other words, change the logic
around setDatasetLanguageId to check the current datasetLanguageId state (or a
persisted value in localStorage) and only default to languagesList[0].id if none
exists; update the component mount/initialization path to restore a saved
datasetLanguageId if present.
- Around line 1229-1256: The PATCH response handler can apply stale updates if
earlier requests resolve after later ones; fix by tracking a per-result request
sequence (e.g., a useRef map like feedbackSeqRef.current[resultId]) that you
increment just before calling fetch in updateFeedback, capture that sequence
value in the request closure, and when the response arrives only call
setResults/update state if the captured sequence equals the current
feedbackSeqRef.current[resultId]; reference updateFeedback, setResults and
resultId when adding this request-sequencing guard so out-of-order responses are
ignored.

---

Nitpick comments:
In `@app/speech-to-text/page.tsx`:
- Around line 1522-1540: The current map over viewModalData.samples eagerly
mounts AudioPlayerFromUrl for every sample, causing wasted metadata fetches;
change the rendering logic in the table row (the map callback that references
viewPlayingId, setViewPlayingId and AudioPlayerFromUrl) so that
AudioPlayerFromUrl is only mounted for the active sample (e.g., render a
lightweight Play button/placeholder for each row and only render
AudioPlayerFromUrl when viewPlayingId === sample.id), keep using
isPlaying={viewPlayingId === sample.id} and toggle viewPlayingId in the button's
onClick to mount/unmount the component.
- Line 1429: Replace the hardcoded hex '#DCCFC3' used in the inline style (the
JSX element that already references colors.bg.primary and boxShadow) with the
appropriate centralized palette token from the shared colors object (e.g., a
border/accent token such as colors.border.warning or the matching token in your
palette); update both occurrences in this file so the inline style uses
colors.<token> instead of '#DCCFC3' to keep color usage consistent with the
project's design system.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8078976b-f304-4ccc-8447-5c37e2f48560

📥 Commits

Reviewing files that changed from the base of the PR and between d130865 and 0901ab3.

📒 Files selected for processing (3)

app/evaluations/page.tsx
app/speech-to-text/page.tsx
app/text-to-speech/page.tsx

app/evaluations/page.tsx

coderabbitai · 2026-03-17T04:31:07Z

app/speech-to-text/page.tsx

+                <span
+                  className="inline-flex items-center justify-center w-3.5 h-3.5 rounded-full text-[9px] font-normal cursor-pointer shrink-0"
+                  style={{ backgroundColor: colors.bg.primary, border: `1px solid ${colors.border}`, color: colors.text.secondary }}
+                  onClick={(e) => {
+                    e.stopPropagation();
+                    e.preventDefault();
+                    const rect = e.currentTarget.getBoundingClientRect();
+                    setLanguageInfoPos({ top: rect.bottom + 4, left: rect.left });
+                    setShowLanguageInfo(!showLanguageInfo);
                  }}


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

wc -l app/speech-to-text/page.tsx

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 108

🏁 Script executed:

sed -n '1160,1180p' app/speech-to-text/page.tsx

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 1168

🏁 Script executed:

sed -n '1957,1975p' app/speech-to-text/page.tsx

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 1804

Use semantic <button> elements instead of clickable <span> for accessible interactive controls.

Lines 1165–1174 and 1962–1969 use clickable <span> elements with onClick handlers for interactive info popovers. These are not keyboard or screen-reader accessible and violate WCAG 2.1 Level A requirements for semantic HTML.

Convert to <button type="button"> with an aria-label attribute:

Span elements don't announce as interactive to screen readers

Keyboard users cannot access these controls (no focus/Enter support)

Both instances require the same fix

Suggested fix pattern

-<span +<button + type="button" + aria-label="Show help" className="inline-flex items-center justify-center w-3.5 h-3.5 rounded-full text-[9px] font-normal cursor-pointer shrink-0" style={{ backgroundColor: colors.bg.primary, border: `1px solid ${colors.border}`, color: colors.text.secondary }} onClick={(e) => { e.stopPropagation(); e.preventDefault(); // existing logic }} > i -</span> +</button>

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@app/speech-to-text/page.tsx` around lines 1165 - 1174, Replace the clickable <span> controls used for the language-info popover with semantic buttons: change the element in the handler that calls setLanguageInfoPos and toggles setShowLanguageInfo (and the second instance at the other location) to <button type="button">, keep the existing className and inline style, preserve the onClick logic (including e.stopPropagation()/e.preventDefault() and getBoundingClientRect usage), and add a descriptive aria-label (e.g., "Show language information") so screen readers announce it; this makes the control keyboard-accessible (Enter/Space) without altering behavior in setLanguageInfoPos, setShowLanguageInfo or showLanguageInfo.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

app/evaluations/[id]/page.tsx (2)
102-108: Consider explicit handling when data.success === false.

When the API returns success: false, the code shows a toast and resets export format but then continues to process data.data || data. If this combination is intentional (graceful degradation), a brief comment would clarify intent.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/`[id]/page.tsx around lines 102 - 108, The branch that
handles data.success === false currently shows a toast and calls
setExportFormat('row') but then continues to compute foundJob = data.data ||
data and proceed; either make this an explicit early return (e.g., after
toast/error handling stop further processing) or add a clarifying comment if
continuing is intentional. Update the block around the data.success check (the
toast.error and setExportFormat calls) to either return/throw after handling the
error so foundJob is not used on a failed response, or add a one-line comment
explaining why you still intend to use data.data || data and ensure any
downstream code that uses foundJob (the throw new Error('Evaluation job not
found') path) is still correct.
418-453: Toggle button styling may appear unintuitive to users.

Per reviewer feedback, the "Group by Questions" button appears greyed out when not selected. The current styling uses colors.text.secondary for unselected state, which can look disabled. Consider adding a subtle background or hover state to clarify these are clickable toggles.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/`[id]/page.tsx around lines 418 - 453, The unselected toggle
buttons (handlers using setExportFormat and reading exportFormat) look disabled
because they use colors.text.secondary; update the toggle styling so unselected
state uses a clearer interactive style — e.g., set a subtle neutral background
(using colors.bg.tertiary or a low-opacity variant of colors.bg.primary), ensure
text uses a readable color (not colors.text.secondary if it appears disabled),
and add a hover/focus state (slight background darken and boxShadow or outline)
to both buttons so "Individual Rows" and "Group by Questions" clearly appear
clickable while preserving the existing selected styles.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/evaluations/`[id]/page.tsx:
- Around line 354-356: Replace the hardcoded HSL error color used in the <p
className="text-sm mb-4"> that renders {error || 'Evaluation job not found'}
(and the same instance later in the file) with a centralized palette value: add
an error color entry to the shared colors palette (e.g., colors.error or
colors.red[...]) and then import and use that value in the component style
instead of 'hsl(8, 86%, 40%)'; update both occurrences in
app/evaluations/[id]/page.tsx to reference the new palette key so error styling
is consistent across the app.
- Around line 183-201: The CSV header generation in csvContent only adds one
column per score name, but the row-building loop in traces (using variables
group, llm_answers, trace_ids and score lookup via group.scores[i]?.find)
appends two values per score (value and comment), causing column misalignment;
update the header generation where scoreNames is iterated to add two columns per
score (e.g., `${name} (${i})` and `${name} (${i}) Comment` or similar) so the
header matches the row output produced by the score value and score.comment
pushes, keeping use of sanitizeCSVCell for comment headers consistent with
existing CSV sanitization.

---

Nitpick comments:
In `@app/evaluations/`[id]/page.tsx:
- Around line 102-108: The branch that handles data.success === false currently
shows a toast and calls setExportFormat('row') but then continues to compute
foundJob = data.data || data and proceed; either make this an explicit early
return (e.g., after toast/error handling stop further processing) or add a
clarifying comment if continuing is intentional. Update the block around the
data.success check (the toast.error and setExportFormat calls) to either
return/throw after handling the error so foundJob is not used on a failed
response, or add a one-line comment explaining why you still intend to use
data.data || data and ensure any downstream code that uses foundJob (the throw
new Error('Evaluation job not found') path) is still correct.
- Around line 418-453: The unselected toggle buttons (handlers using
setExportFormat and reading exportFormat) look disabled because they use
colors.text.secondary; update the toggle styling so unselected state uses a
clearer interactive style — e.g., set a subtle neutral background (using
colors.bg.tertiary or a low-opacity variant of colors.bg.primary), ensure text
uses a readable color (not colors.text.secondary if it appears disabled), and
add a hover/focus state (slight background darken and boxShadow or outline) to
both buttons so "Individual Rows" and "Group by Questions" clearly appear
clickable while preserving the existing selected styles.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5a8013f3-a723-444e-a901-a39cf58ca652

📥 Commits

Reviewing files that changed from the base of the PR and between 0901ab3 and 51a8f83.

📒 Files selected for processing (1)

app/evaluations/[id]/page.tsx

app/evaluations/[id]/page.tsx

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

app/components/ConfigSelector.tsx (1)
220-235: ⚠️ Potential issue | 🟡 Minor

Search input can be obscured by the outside-click overlay.

The relative container at line 220 lacks a z-index, while the outside-click overlay at lines 403-408 has z-40. This means clicks on the search input may be intercepted by the overlay, closing the dropdown unexpectedly.
Suggested fix
-          <div className="relative">
+          <div className="relative z-50">
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/ConfigSelector.tsx` around lines 220 - 235, The search input
is being blocked by the outside-click overlay (overlay uses z-40); update the
dropdown root container (the div wrapping the conditional when isDropdownOpen)
or the input element used with searchQuery/setSearchQuery so it has a higher
stacking context (e.g., add a z-index > 40 via class or style such as z-50)
ensuring the input receives clicks; adjust the element containing
isDropdownOpen/rendered input (the "relative" div) to include the z-index
change.

🧹 Nitpick comments (1)

app/components/ConfigSelector.tsx (1)

153-165: Consider adding keyboard focus styles for accessibility.

The button uses onMouseEnter/onMouseLeave for hover effects, but keyboard users won't see any visual feedback when the button is focused. This may also contribute to the PR feedback that the button "appears greyed out" since it lacks clear interactive affordances.

Suggested improvement

             className="px-3 py-1.5 rounded-md text-xs font-medium transition-colors"
+            // Consider adding focus:ring or focus:outline styles via Tailwind
             style={{
               backgroundColor: colors.bg.primary,
               border: `1px solid ${colors.border}`,
               color: colors.text.primary,
             }}
+            onFocus={(e) => e.currentTarget.style.backgroundColor = colors.bg.secondary}
+            onBlur={(e) => e.currentTarget.style.backgroundColor = colors.bg.primary}
             onMouseEnter={(e) => e.currentTarget.style.backgroundColor = colors.bg.secondary}
             onMouseLeave={(e) => e.currentTarget.style.backgroundColor = colors.bg.primary}

Alternatively, use Tailwind's focus: and hover: utilities with CSS variables for a cleaner approach.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/components/ConfigSelector.tsx` around lines 153 - 165, The Browse Library
button in ConfigSelector.tsx (the element binding handleBrowseLibrary) only
changes styles on mouse events, so keyboard users get no focus feedback; add
keyboard-focus styles by either replacing the inline mouse handlers with
className utilities (e.g., Tailwind's hover: and focus: classes) that reference
your color variables, or add onFocus and onBlur handlers that mirror
onMouseEnter/onMouseLeave to set e.currentTarget.style.backgroundColor to
colors.bg.secondary and revert to colors.bg.primary on blur; ensure the button
also has a visible focus outline (e.g., outline or boxShadow) and keep the
existing onClick=handleBrowseLibrary intact.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/components/Sidebar.tsx`:
- Around line 58-59: The persisted sidebar state under the storage key
"sidebar-expanded-menus" can lose expansion for renamed top-level menu
"Evaluations"; update the Sidebar component's load/save logic to perform a small
migration: when reading the stored JSON for "sidebar-expanded-menus" detect
legacy keys (e.g., old name(s) previously used for the evaluations menu) and
remap them to "Evaluations" before using the state, then continue to save using
the new "Evaluations" key; implement this in the functions that parse/load the
stored state (and ensure the save path uses the new key) so existing users
retain expansion behavior.

---

Duplicate comments:
In `@app/components/ConfigSelector.tsx`:
- Around line 220-235: The search input is being blocked by the outside-click
overlay (overlay uses z-40); update the dropdown root container (the div
wrapping the conditional when isDropdownOpen) or the input element used with
searchQuery/setSearchQuery so it has a higher stacking context (e.g., add a
z-index > 40 via class or style such as z-50) ensuring the input receives
clicks; adjust the element containing isDropdownOpen/rendered input (the
"relative" div) to include the z-index change.

---

Nitpick comments:
In `@app/components/ConfigSelector.tsx`:
- Around line 153-165: The Browse Library button in ConfigSelector.tsx (the
element binding handleBrowseLibrary) only changes styles on mouse events, so
keyboard users get no focus feedback; add keyboard-focus styles by either
replacing the inline mouse handlers with className utilities (e.g., Tailwind's
hover: and focus: classes) that reference your color variables, or add onFocus
and onBlur handlers that mirror onMouseEnter/onMouseLeave to set
e.currentTarget.style.backgroundColor to colors.bg.secondary and revert to
colors.bg.primary on blur; ensure the button also has a visible focus outline
(e.g., outline or boxShadow) and keep the existing onClick=handleBrowseLibrary
intact.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3e214ecb-c234-4ddb-bb24-9c5576ba4914

📥 Commits

Reviewing files that changed from the base of the PR and between 51a8f83 and f5c2fa1.

📒 Files selected for processing (2)

app/components/ConfigSelector.tsx
app/components/Sidebar.tsx

coderabbitai · 2026-03-17T05:36:34Z

app/components/Sidebar.tsx

+      name: 'Evaluations',
      icon: (


⚠️ Potential issue | 🟡 Minor

Persisted sidebar state may regress after menu key rename

Because the top-level key is now Evaluations, users with older saved state (e.g., legacy keys) can lose expected expansion behavior after Line 58. Consider a small migration/fallback when loading sidebar-expanded-menus so old keys map to Evaluations.

Suggested patch

+ const DEFAULT_EXPANDED_MENUS: Record<string, boolean> = { + Evaluations: true, + Configurations: false, + }; + - const [expandedMenus, setExpandedMenus] = useState<Record<string, boolean>>({ - 'Evaluations': true, - 'Configurations': false, - }); + const [expandedMenus, setExpandedMenus] = useState<Record<string, boolean>>(DEFAULT_EXPANDED_MENUS); useEffect(() => { const saved = localStorage.getItem('sidebar-expanded-menus'); if (saved) { try { - setExpandedMenus(JSON.parse(saved)); + const parsed = JSON.parse(saved) as Record<string, boolean>; + setExpandedMenus({ + ...DEFAULT_EXPANDED_MENUS, + ...parsed, + Evaluations: parsed.Evaluations ?? parsed.Capabilities ?? DEFAULT_EXPANDED_MENUS.Evaluations, + }); } catch (e) { console.error('Failed to load sidebar state:', e); } } }, []);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@app/components/Sidebar.tsx` around lines 58 - 59, The persisted sidebar state under the storage key "sidebar-expanded-menus" can lose expansion for renamed top-level menu "Evaluations"; update the Sidebar component's load/save logic to perform a small migration: when reading the stored JSON for "sidebar-expanded-menus" detect legacy keys (e.g., old name(s) previously used for the evaluations menu) and remap them to "Evaluations" before using the state, then continue to save using the new "Evaluations" key; implement this in the functions that parse/load the stored state (and ensure the save path uses the new key) so existing users retain expansion behavior.

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (9)

app/evaluations/page.tsx (5)

259-264: ⚠️ Potential issue | 🟠 Major

Refresh the runs list after a successful evaluation POST.

handleRunEvaluation() never updates evalJobs, so a newly created run stays invisible in the right pane until the user manually refreshes. Either lift fetchEvaluations/evalJobs up or pass a callback down so the success path can refresh immediately.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/page.tsx` around lines 259 - 264, handleRunEvaluation creates
a new evaluation but never refreshes the evalJobs list, so the new run doesn't
appear; update the success path in handleRunEvaluation to refresh the runs by
calling the existing fetchEvaluations function (or a passed-in refresh callback)
after the POST completes and before/after setIsEvaluating(false) and
toast.success. Locate the success branch in handleRunEvaluation (where
response.json() is parsed and toast.success is called) and invoke
fetchEvaluations() or the provided refresh callback to update the evalJobs state
so the right-pane list shows the new run immediately.

123-129: ⚠️ Potential issue | 🟠 Major

Strip the UTF-8 BOM before validating CSV headers.

Spreadsheet exports commonly start with \uFEFF, which makes the first header \uFEFFquestion here and rejects an otherwise valid question,answer file.

Suggested fix

-      const firstLine = text.split('\n')[0]?.trim();
+      const firstLine = text.split('\n')[0]?.replace(/^\uFEFF/, '').trim();

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/page.tsx` around lines 123 - 129, The CSV parsing currently
treats a leading UTF-8 BOM as part of the first header (see firstLine and
headers variables), causing valid files like "question,answer" to be rejected;
before checking/using firstLine and before splitting into headers, strip any
leading BOM (U+FEFF) from the string (e.g., remove /^\uFEFF/ from firstLine),
then continue trimming and lowercasing header tokens and keep the existing
empty-file handling that clears event.target.value. Ensure the BOM-stripped
value is used for both the empty check and the headers calculation.

508-530: ⚠️ Potential issue | 🟠 Major

The dataset viewer still corrupts quoted multiline CSV cells.

Splitting csvText on \n before honoring CSV quotes turns one logical record into multiple rows when a cell contains an embedded newline, and the reconstructed download preserves that broken shape.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/page.tsx` around lines 508 - 530, The current approach splits
csvText by '\n' into lines before calling parseRow, which breaks quoted
multiline cells; change the logic to parse csvText as a stream and build rows
while respecting quotes instead of pre-splitting: replace the lines/parseRow
usage with a single pass over csvText that accumulates characters into a current
field and current row, toggles inQuotes on '"' (handling '""' escapes), only
ends a field on unquoted commas and ends a row on unquoted newlines, then derive
headers from the first completed row and rows from subsequent completed rows;
update references to parseRow, lines, headers, and rows accordingly.

1019-1020: ⚠️ Potential issue | 🟠 Major

Gate Run Evaluation on the resolved dataset, not the raw id.

After a dataset is deleted or a stale dataset query param is loaded, selectedDatasetId can stay truthy while selectedDataset is undefined, and Line 239 will still POST that stale dataset id.

Suggested fix

-  const canRun = experimentName.trim() && selectedDatasetId && selectedConfigId && selectedConfigVersion && !isEvaluating;
+  const canRun = !!(
+    experimentName.trim() &&
+    selectedDataset &&
+    selectedConfigId &&
+    selectedConfigVersion &&
+    !isEvaluating
+  );

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/page.tsx` around lines 1019 - 1020, The current run gate uses
selectedDatasetId which can be stale; change the gating and submission to depend
on the resolved selectedDataset object instead. Update the boolean canRun to
require selectedDataset (e.g., replace selectedDatasetId in the expression with
selectedDataset) and ensure any code that POSTs the dataset uses
selectedDataset.dataset_id (or the resolved id property) rather than the raw
selectedDatasetId so deleted/stale ids are not submitted.

692-703: ⚠️ Potential issue | 🟠 Major

Add accessible names to the icon-only controls.

These buttons currently announce as unlabeled controls. Please add aria-labels such as “Remove CSV”, “Close dataset viewer”, and “Refresh evaluations” so screen-reader users can operate them.

Also applies to: 875-883, 1261-1275

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/page.tsx` around lines 692 - 703, The icon-only buttons
(e.g., the delete button that calls onRemoveFile and clears fileInputRef.value)
lack accessible names; update these button elements to include descriptive
aria-label attributes (for example aria-label="Remove CSV" for the onRemoveFile
button) and do the same for other icon-only controls mentioned (close dataset
viewer, refresh evaluations) so screen readers announce their purpose; ensure
aria-labels match the button action and are added to the button elements
wrapping the SVGs (and any similar buttons at the other ranges).

app/speech-to-text/page.tsx (2)

1048-1050: ⚠️ Potential issue | 🟠 Major

The dataset viewer still hard-codes the first 100 samples.

This modal is presented as the dataset view/editor, but the request silently truncates anything past the first page. Please add pagination/load-more state or make the partial scope explicit in the UI.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/speech-to-text/page.tsx` around lines 1048 - 1050, The dataset viewer
currently hard-codes fetching the first 100 samples via the fetch call that
includes include_samples=true&sample_limit=100&sample_offset=0 (using datasetId
and apiKeys[0].key); update the component to support pagination or an explicit
partial-scope indicator: introduce state for sampleLimit and sampleOffset (or a
nextPage token), replace the hard-coded query params in the fetch with those
state values, add UI controls (Load more / page controls) that update
sampleOffset and re-fetch, and/or show a clear badge/message in the modal
indicating "Showing first 100 samples" when you opt not to paginate; ensure the
fetch call and its surrounding handler (the code that constructs the URL and
processes response) reference the new state so additional pages load correctly.

1165-1177: ⚠️ Potential issue | 🟡 Minor

Replace the clickable help <span>s with buttons.

These popover triggers are not keyboard or screen-reader accessible as spans. Use <button type="button"> with an aria-label, and open the popover on focus/hover as well as click for the dataset-language help and the score-metric help.

Also applies to: 1981-1994

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/speech-to-text/page.tsx` around lines 1165 - 1177, Replace the clickable
<span> popover triggers with semantic buttons: change the span to <button
type="button"> (preserve className, inline style, and existing onClick logic
including stopPropagation/preventDefault if still needed) and add a descriptive
aria-label (e.g., "Open language help"). Also wire up keyboard and hover/focus
interactions by calling setLanguageInfoPos(...) and setShowLanguageInfo(true)
from onFocus and onMouseEnter, and closing the popover on onBlur and
onMouseLeave (setShowLanguageInfo(false)); apply the same replacement and
handlers for the corresponding score-metric help instance (the other block
around the setShowLanguageInfo / setLanguageInfoPos usage at ~1981-1994).

app/text-to-speech/page.tsx (2)

1230-1257: ⚠️ Potential issue | 🟠 Major

Serialize feedback PATCHes per result.

The row handlers update results optimistically, and updateFeedback() reapplies the same fields again when each PATCH resolves. Two quick edits to the same result can resolve out of order and let the older response overwrite the newer choice.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/text-to-speech/page.tsx` around lines 1230 - 1257, The updateFeedback
function is vulnerable to out-of-order PATCH responses overwriting newer
optimistic updates; implement per-result serialization by tracking a per-result
pending update token or promise queue keyed by resultId (e.g., a Map<number,
Promise> or Map<number, number> version counter) and use it inside
updateFeedback to ensure each PATCH is sent and applied in order — only apply
the response/update to setResults for a result when the token/version matches
the latest for that resultId; keep references to updateFeedback, resultId,
setResults and results when wiring the token/queue logic so older responses are
ignored.
815-837: ⚠️ Potential issue | 🟠 Major

The dataset viewer/parser still breaks quoted multiline CSV cells.

csvText.split('\n') will split valid quoted samples containing embedded line breaks into multiple rows, so both the preview and the CSV download can corrupt real datasets.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/text-to-speech/page.tsx` around lines 815 - 837, The code currently uses
csvText.split('\n') which breaks quoted multiline CSV fields; instead implement
a splitIntoRecords function that scans csvText character-by-character, toggles
inQuotes on double quotes, and only treats newline characters as record
separators when not inQuotes, then feed those records into the existing
parseRow; replace the lines assignment and subsequent headers/rows usage to use
splitIntoRecords(csvText) so parseRow continues to parse fields but no longer
receives broken multiline cells.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/evaluations/page.tsx`:
- Around line 625-649: Replace the click-only <span> trigger with a semantic
<button type="button"> (keeping the same classes/styles) and wire its
interactions to toggle and position the helper in addition to click: keep using
getBoundingClientRect to setDuplicationInfoPos and call setShowDuplicationInfo,
but also open the helper on onFocus and onMouseEnter and close it on onBlur and
onMouseLeave (and support Escape via onKeyDown to call
setShowDuplicationInfo(false)); ensure all handlers still call
e.stopPropagation()/e.preventDefault() where appropriate and preserve use of
showDuplicationInfo, setShowDuplicationInfo, duplicationInfoPos, and
setDuplicationInfoPos so keyboard users and hover users can discover the
Duplication Factor helper.

In `@app/text-to-speech/page.tsx`:
- Around line 1514-1556: The interactive "i" trigger is a non-semantic <span>
which blocks keyboard access and hover/focus triggers; replace that <span> with
a <button type="button"> (preserve className and inline styles) and add an
aria-label (e.g., "Speech Naturalness help"); keep the existing onClick handler
but move e.stopPropagation() into the button handler, and add onFocus/onBlur and
onMouseEnter/onMouseLeave handlers that call setScoreInfoPos(scoreInfoPos
calculation) and toggle setOpenScoreInfo('speech_naturalness') so the popover
opens on keyboard focus and hover as well; ensure the popover still uses
openScoreInfo, scoreInfoPos and onClick={(e)=>e.stopPropagation()} unchanged.
Apply the same change to the other tooltip instance referenced (lines around the
second occurrence).

---

Duplicate comments:
In `@app/evaluations/page.tsx`:
- Around line 259-264: handleRunEvaluation creates a new evaluation but never
refreshes the evalJobs list, so the new run doesn't appear; update the success
path in handleRunEvaluation to refresh the runs by calling the existing
fetchEvaluations function (or a passed-in refresh callback) after the POST
completes and before/after setIsEvaluating(false) and toast.success. Locate the
success branch in handleRunEvaluation (where response.json() is parsed and
toast.success is called) and invoke fetchEvaluations() or the provided refresh
callback to update the evalJobs state so the right-pane list shows the new run
immediately.
- Around line 123-129: The CSV parsing currently treats a leading UTF-8 BOM as
part of the first header (see firstLine and headers variables), causing valid
files like "question,answer" to be rejected; before checking/using firstLine and
before splitting into headers, strip any leading BOM (U+FEFF) from the string
(e.g., remove /^\uFEFF/ from firstLine), then continue trimming and lowercasing
header tokens and keep the existing empty-file handling that clears
event.target.value. Ensure the BOM-stripped value is used for both the empty
check and the headers calculation.
- Around line 508-530: The current approach splits csvText by '\n' into lines
before calling parseRow, which breaks quoted multiline cells; change the logic
to parse csvText as a stream and build rows while respecting quotes instead of
pre-splitting: replace the lines/parseRow usage with a single pass over csvText
that accumulates characters into a current field and current row, toggles
inQuotes on '"' (handling '""' escapes), only ends a field on unquoted commas
and ends a row on unquoted newlines, then derive headers from the first
completed row and rows from subsequent completed rows; update references to
parseRow, lines, headers, and rows accordingly.
- Around line 1019-1020: The current run gate uses selectedDatasetId which can
be stale; change the gating and submission to depend on the resolved
selectedDataset object instead. Update the boolean canRun to require
selectedDataset (e.g., replace selectedDatasetId in the expression with
selectedDataset) and ensure any code that POSTs the dataset uses
selectedDataset.dataset_id (or the resolved id property) rather than the raw
selectedDatasetId so deleted/stale ids are not submitted.
- Around line 692-703: The icon-only buttons (e.g., the delete button that calls
onRemoveFile and clears fileInputRef.value) lack accessible names; update these
button elements to include descriptive aria-label attributes (for example
aria-label="Remove CSV" for the onRemoveFile button) and do the same for other
icon-only controls mentioned (close dataset viewer, refresh evaluations) so
screen readers announce their purpose; ensure aria-labels match the button
action and are added to the button elements wrapping the SVGs (and any similar
buttons at the other ranges).

In `@app/speech-to-text/page.tsx`:
- Around line 1048-1050: The dataset viewer currently hard-codes fetching the
first 100 samples via the fetch call that includes
include_samples=true&sample_limit=100&sample_offset=0 (using datasetId and
apiKeys[0].key); update the component to support pagination or an explicit
partial-scope indicator: introduce state for sampleLimit and sampleOffset (or a
nextPage token), replace the hard-coded query params in the fetch with those
state values, add UI controls (Load more / page controls) that update
sampleOffset and re-fetch, and/or show a clear badge/message in the modal
indicating "Showing first 100 samples" when you opt not to paginate; ensure the
fetch call and its surrounding handler (the code that constructs the URL and
processes response) reference the new state so additional pages load correctly.
- Around line 1165-1177: Replace the clickable <span> popover triggers with
semantic buttons: change the span to <button type="button"> (preserve className,
inline style, and existing onClick logic including
stopPropagation/preventDefault if still needed) and add a descriptive aria-label
(e.g., "Open language help"). Also wire up keyboard and hover/focus interactions
by calling setLanguageInfoPos(...) and setShowLanguageInfo(true) from onFocus
and onMouseEnter, and closing the popover on onBlur and onMouseLeave
(setShowLanguageInfo(false)); apply the same replacement and handlers for the
corresponding score-metric help instance (the other block around the
setShowLanguageInfo / setLanguageInfoPos usage at ~1981-1994).

In `@app/text-to-speech/page.tsx`:
- Around line 1230-1257: The updateFeedback function is vulnerable to
out-of-order PATCH responses overwriting newer optimistic updates; implement
per-result serialization by tracking a per-result pending update token or
promise queue keyed by resultId (e.g., a Map<number, Promise> or Map<number,
number> version counter) and use it inside updateFeedback to ensure each PATCH
is sent and applied in order — only apply the response/update to setResults for
a result when the token/version matches the latest for that resultId; keep
references to updateFeedback, resultId, setResults and results when wiring the
token/queue logic so older responses are ignored.
- Around line 815-837: The code currently uses csvText.split('\n') which breaks
quoted multiline CSV fields; instead implement a splitIntoRecords function that
scans csvText character-by-character, toggles inQuotes on double quotes, and
only treats newline characters as record separators when not inQuotes, then feed
those records into the existing parseRow; replace the lines assignment and
subsequent headers/rows usage to use splitIntoRecords(csvText) so parseRow
continues to parse fields but no longer receives broken multiline cells.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 31439088-8dba-4a92-9d12-5eb51b91b779

📥 Commits

Reviewing files that changed from the base of the PR and between f5c2fa1 and 1bc7582.

📒 Files selected for processing (3)

app/evaluations/page.tsx
app/speech-to-text/page.tsx
app/text-to-speech/page.tsx

coderabbitai · 2026-03-17T07:06:56Z

app/evaluations/page.tsx

+                <span
+                  className="inline-flex items-center justify-center w-3.5 h-3.5 rounded-full text-[9px] font-normal cursor-pointer shrink-0"
+                  style={{ backgroundColor: colors.bg.primary, border: `1px solid ${colors.border}`, color: colors.text.secondary }}
+                  onClick={(e) => {
+                    e.stopPropagation();
+                    e.preventDefault();
+                    const rect = e.currentTarget.getBoundingClientRect();
+                    setDuplicationInfoPos({ top: rect.bottom + 4, left: rect.left });
+                    setShowDuplicationInfo(!showDuplicationInfo);
+                  }}
+                >
+                  i
+                </span>
+                {showDuplicationInfo && (
+                  <div
+                    className="fixed z-50 rounded-lg shadow-lg border text-xs p-3"
+                    style={{ backgroundColor: colors.bg.primary, borderColor: colors.border, width: '280px', top: duplicationInfoPos.top, left: duplicationInfoPos.left }}
+                    onClick={(e) => e.stopPropagation()}
+                  >
+                    <div className="font-semibold mb-1" style={{ color: colors.text.primary }}>Duplication Factor</div>
+                    <p style={{ color: colors.text.secondary, lineHeight: '1.5' }}>
+                      Controls how many times each question is sent to the AI to generate an answer. For example, setting this to 3 means the AI answers each question 3 separate times — helpful for checking if the AI gives consistent and reliable responses each time.
+                    </p>
+                  </div>
+                )}


⚠️ Potential issue | 🟡 Minor

Expose the help tip on a semantic trigger, not a click-only <span>.

This affordance is easy to miss and is not keyboard-accessible in its current form. Converting it to a <button type="button"> and opening on focus/hover as well as click will make the helper discoverable without changing the layout.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@app/evaluations/page.tsx` around lines 625 - 649, Replace the click-only <span> trigger with a semantic <button type="button"> (keeping the same classes/styles) and wire its interactions to toggle and position the helper in addition to click: keep using getBoundingClientRect to setDuplicationInfoPos and call setShowDuplicationInfo, but also open the helper on onFocus and onMouseEnter and close it on onBlur and onMouseLeave (and support Escape via onKeyDown to call setShowDuplicationInfo(false)); ensure all handlers still call e.stopPropagation()/e.preventDefault() where appropriate and preserve use of showDuplicationInfo, setShowDuplicationInfo, duplicationInfoPos, and setDuplicationInfoPos so keyboard users and hover users can discover the Duplication Factor helper.

coderabbitai · 2026-03-17T07:06:56Z

app/text-to-speech/page.tsx

+                          <span
+                            className="inline-flex items-center justify-center w-3.5 h-3.5 rounded-full text-[9px] font-normal cursor-pointer align-middle"
+                            style={{ backgroundColor: colors.bg.primary, border: `1px solid ${colors.border}`, color: colors.text.secondary }}
+                            onClick={(e) => {
+                              e.stopPropagation();
+                              const rect = e.currentTarget.getBoundingClientRect();
+                              setScoreInfoPos({ top: rect.bottom + 4, left: rect.left });
+                              setOpenScoreInfo(openScoreInfo === 'speech_naturalness' ? null : 'speech_naturalness');
+                            }}
+                          >
+                            i
+                          </span>
+                          {openScoreInfo === 'speech_naturalness' && (
+                            <div
+                              className="fixed z-50 rounded-lg shadow-lg border text-xs"
+                              style={{ backgroundColor: colors.bg.primary, borderColor: colors.border, width: '340px', top: scoreInfoPos.top, left: scoreInfoPos.left }}
+                              onClick={(e) => e.stopPropagation()}
+                            >
+                              <div className="p-3">
+                                <div className="font-semibold mb-2" style={{ color: colors.text.primary }}>Speech Naturalness</div>
+                                <p className="mb-3" style={{ color: colors.text.secondary, fontFamily: 'system-ui, sans-serif' }}>
+                                  Assesses how human-like the generated speech sounds.
+                                </p>
+                                <div className="mb-1 font-semibold" style={{ color: colors.text.primary }}>Scoring</div>
+                                <div className="space-y-2 p-2 rounded" style={{ backgroundColor: colors.bg.secondary }}>
+                                  <div className="flex">
+                                    <span className="font-semibold shrink-0" style={{ color: colors.status.success, width: '62px' }}>High:</span>
+                                    <span style={{ color: colors.text.primary }}>Very human-like, natural flow with appropriate pauses and inflections.</span>
+                                  </div>
+                                  <div className="flex">
+                                    <span className="font-semibold shrink-0" style={{ color: '#ca8a04', width: '62px' }}>Medium:</span>
+                                    <span style={{ color: colors.text.primary }}>Some human qualities but with occasional robotic or awkward elements.</span>
+                                  </div>
+                                  <div className="flex">
+                                    <span className="font-semibold shrink-0" style={{ color: colors.status.error, width: '62px' }}>Low:</span>
+                                    <span style={{ color: colors.text.primary }}>Clearly robotic or artificial, with choppy or monotone speech.</span>
+                                  </div>
+                                </div>
+                                <div className="mt-2 font-semibold" style={{ color: colors.status.success }}>Higher is better.</div>
+                              </div>
+                            </div>
+                          )}
+                          </div>


⚠️ Potential issue | 🟡 Minor

Use semantic buttons for the score help popovers and don’t make them click-only.

These tooltip triggers are currently interactive <span>s, so they are not keyboard reachable, and only opening on click makes the help easy to miss. A <button type="button"> with an aria-label, plus focus/hover handling, would address both problems.

Also applies to: 1563-1605

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@app/text-to-speech/page.tsx` around lines 1514 - 1556, The interactive "i" trigger is a non-semantic <span> which blocks keyboard access and hover/focus triggers; replace that <span> with a <button type="button"> (preserve className and inline styles) and add an aria-label (e.g., "Speech Naturalness help"); keep the existing onClick handler but move e.stopPropagation() into the button handler, and add onFocus/onBlur and onMouseEnter/onMouseLeave handlers that call setScoreInfoPos(scoreInfoPos calculation) and toggle setOpenScoreInfo('speech_naturalness') so the popover opens on keyboard focus and hover as well; ensure the popover still uses openScoreInfo, scoreInfoPos and onClick={(e)=>e.stopPropagation()} unchanged. Apply the same change to the other tooltip instance referenced (lines around the second occurrence).

Ayush8923

I’ve added a few comments which are mostly small nitpicks and can be quickly addressed.

Since this is a large PR, there are a few areas where we can improve the component structure and update some pages. To keep this PR focused, we can track those improvements in this issue: #71

If the functionality is working as expected, we can proceed with merging this PR once the nitpicks are addressed.

app/components/DetailedResultsTable.tsx

app/components/ScoreDisplay.tsx

app/datasets/page.tsx

Ayush8923 · 2026-03-17T11:48:59Z

app/text-to-speech/page.tsx

+            tabs={[
+              { id: 'datasets', label: 'Datasets' },
+              { id: 'evaluations', label: 'Evaluations' },
+            ]}


Instead of hardcoding the tabs here, we can move them into a constant. This will make it easier to manage and update in the future. If we need to add a new tab, we’ll only need to update the constant rather than modifying it in multiple places.

Also, we can maintain separate constants for tabs across different modules if needed.

AkhileshNegi added 30 commits March 8, 2026 00:57

first stab at tts eval screen

2f91445

Merge branch 'main' into feature/tts-evaluation

dd84333

added playable audio in TTS screen

57b3eee

updated message

ebc764c

consistent UI for text evaluation

22309ad

cleanups

310bd82

coderabbit suggestions

4565b70

Merge branch 'main' into feature/tts-evaluation

dbea04e

cleanup language field

9289305

cleanups

0be3a5b

cleanups in text evaluation screen

30ff1b3

coderabbit suggestions

f32ad58

first stab at TTS metric

ca4fab4

first stab at overhaul

2224592

refactor text evaluation screen

dd143f4

evaluation score overhaul

b129dad

eval screen revamp

21a39a9

updated TTS/STT cards

5699216

API Key required cleanup

a92beb7

added datasets

d3cf8e3

cleanups

0a505a3

added support for updating samples

cacdd2e

updated TTS feedback results

0b5cb86

update TTS sample

04074f8

moving to dropdown

65d91f7

cleanup cards

e8551c9

consistent cars

e116cb3

cleanup STT headings

383f8a9

consistent UI in STT, TTS main screen

20e3619

added run evaluation text for all screens

d130865

coderabbitai bot reviewed Mar 17, 2026

View reviewed changes

conflict fixes

51a8f83

coderabbitai bot reviewed Mar 17, 2026

View reviewed changes

app/evaluations/[id]/page.tsx Show resolved Hide resolved

app/evaluations/[id]/page.tsx Outdated Show resolved Hide resolved

reorder sidebar

f5c2fa1

coderabbitai bot reviewed Mar 17, 2026

View reviewed changes

added dropdown

1bc7582

coderabbitai bot reviewed Mar 17, 2026

View reviewed changes

coderabbit suggestions

a58e538

AkhileshNegi self-assigned this Mar 17, 2026

AkhileshNegi added the enhancement New feature or request label Mar 17, 2026

AkhileshNegi added 2 commits March 17, 2026 12:49

code cleanups

9cbe5e0

making error as toast

d59a9d5

AkhileshNegi requested review from Ayush8923, Prajna1999, nishika26 and vprashrex March 17, 2026 07:35

AkhileshNegi linked an issue Mar 17, 2026 that may be closed by this pull request

Evaluation UI: Update Results screen #6

Closed

Ayush8923 reviewed Mar 17, 2026

View reviewed changes

AkhileshNegi added 3 commits March 17, 2026 23:48

move to in-line style

cfd4797

making info as hover

5e6e294

Merge branch 'main' into enhancement/ui-overhaul

d71e51a

AkhileshNegi merged commit 31b0489 into main Mar 18, 2026
1 check was pending

This was referenced Mar 18, 2026

Fix(security): remove NEXT_PUBLIC_ usage from server-only route handlers #75

Merged

Guardrails UI: add validators to config #72

Open

Ayush8923 deleted the enhancement/ui-overhaul branch March 20, 2026 11:35

This was referenced Mar 23, 2026

Minimize API Calls Across Config/Evaluation Flows #83

Merged

CI: Build Lint Checks #85

Merged

API management centralize store #89

Merged

Phase:1- Refactoring #90

Merged

Lazy-load config versions on expand & extract inline SVGs to icon components #96

Merged

Conversation

AkhileshNegi commented Mar 16, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Summary by CodeRabbit

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

Ayush8923 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Ayush8923 Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

AkhileshNegi commented Mar 16, 2026 •

edited by coderabbitai bot

Loading