Skip to content

Evals UI: fixing score and config display#56

Merged
nishika26 merged 7 commits intomainfrom
bug/evals_score
Mar 10, 2026
Merged

Evals UI: fixing score and config display#56
nishika26 merged 7 commits intomainfrom
bug/evals_score

Conversation

@nishika26
Copy link
Copy Markdown
Contributor

@nishika26 nishika26 commented Mar 8, 2026

Target issue is #51

Summary by CodeRabbit

  • New Features

    • Real-time config search/filter with autofocus and clear empty-state messaging.
    • Resync action in Metrics Overview (Resyncing... state) with a No Traces modal when applicable.
    • Knowledge Base IDs surfaced across config views, tool sections, and evaluation details; header shows name/version.
    • Tooltip for Max Results control.
    • Improved score presentation: summary scores supported across formats, grouped vs. row results, and clearer “summary-only” messaging.
  • Bug Fixes

    • Normalized timestamp handling to assume UTC when timezone is missing.
  • Style

    • Improved text color contrast; label text changed from “Vector Store” to “Knowledge Base.”

@nishika26 nishika26 self-assigned this Mar 8, 2026
@nishika26 nishika26 added the enhancement New feature or request label Mar 8, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 8, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds resync_score passthrough to the evaluations API and a frontend resync workflow/UI; surfaces and deduplicates knowledge_base_ids across config UIs; adds searchable config selector; expands score handling to accept summary-only score objects; adds Max Results tooltip and normalizes string timestamps to UTC.

Changes

Cohort / File(s) Summary
Resync API & Page
app/api/evaluations/[id]/route.ts, app/evaluations/[id]/page.tsx
Reads resync_score from requests and forwards it to backend; adds frontend handleResync flow, Resync button, loading/error handling, conditional post-resync fetches, and a No Traces modal.
Knowledge Base IDs (KB) surface
app/components/ConfigModal.tsx, app/components/ConfigCard.tsx, app/components/ConfigDrawer.tsx, app/components/.../CurrentConfigTab.tsx, app/components/DiffView.tsx, app/evaluations/[id]/page.tsx
Adds knowledge_base_ids to config version info, deduplicates IDs from params and tools, renders KB badges/sections, and renames Vector Store labels to Knowledge Base across UI components.
Config Selection Search UX
app/components/ConfigSelector.tsx
Adds searchQuery with live filtering, two-state dropdown (search input vs button), autofocus, empty-result messaging, and clears query on close/select.
Score types, normalization & displays
app/components/types.ts, app/components/ScoreDisplay.tsx, app/components/DetailedResultsTable.tsx, app/evaluations/[id]/page.tsx
Introduces BasicScoreObject, updates ScoreObject union and guards (hasSummaryScores, isBasicScoreObject), updates normalizeToIndividualScores, and replaces prior new-format guards with hasSummaryScores to support summary-only score payloads.
Prompt editor tooltip & styling
app/components/prompt-editor/ConfigEditorPane.tsx, app/components/prompt-editor/CurrentConfigTab.tsx
Adds showTooltip state and a hover-triggered tooltip for Max Results, renames Vector Store ID → Knowledge Base ID, and adjusts label/input color styling.
Time formatting utility
app/lib/useConfigs.ts
Normalizes string timestamps by appending Z when timezone is missing (assume UTC) before computing relative times.
Minor imports & UI text updates
app/components/ScoreDisplay.tsx, app/components/DetailedResultsTable.tsx, app/components/types.ts, small label changes across components
Replaces old format guards/imports with hasSummaryScores, adjusts rendering branches, and updates multiple UI labels from "Vector Store" to "Knowledge Base".

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Frontend as Frontend (page.tsx)
    participant APIRoute as API Route (route.ts)
    participant Backend as Backend Server

    User->>Frontend: Click "Resync" button
    activate Frontend
    Frontend->>Frontend: set isResyncing = true
    Frontend->>APIRoute: GET /api/evaluations/:id?resync_score=true
    deactivate Frontend

    activate APIRoute
    APIRoute->>Backend: Forward request with resync_score param
    deactivate APIRoute

    activate Backend
    Backend->>Backend: Recalculate/resync scores
    Backend-->>APIRoute: Return updated evaluation payload
    deactivate Backend

    activate APIRoute
    APIRoute-->>Frontend: Return evaluation payload
    deactivate APIRoute

    activate Frontend
    Frontend->>Frontend: Update job/evaluation state
    Frontend->>Frontend: Optionally fetch assistant config & config version
    Frontend->>Frontend: Show success/error toast
    Frontend->>Frontend: set isResyncing = false
    deactivate Frontend

    Frontend-->>User: Display updated metrics and KB info
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • vprashrex
  • Prajna1999
  • AkhileshNegi

Poem

🐰 I hopped through configs, KBs in my paw,
I carried a resync to tidy the law,
Search sniffed each group, tooltips gave hints,
Timestamps found midnight, summary scores made sense,
Metrics refreshed — hop, done! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 17.65% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: fixes to score display logic (switching from isNewScoreObject to hasSummaryScores, adding resync functionality) and config display updates (renaming Vector Store to Knowledge Base, adding knowledge_base_ids display).

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch bug/evals_score

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
app/components/ConfigModal.tsx (1)

266-277: Minor: Consider consistent terminology.

The top-level section uses "Knowledge Base IDs" (line 208) while per-tool sections use "Vector Store IDs" (line 269). If these refer to the same concept, consider using consistent terminology to avoid confusion.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/ConfigModal.tsx` around lines 266 - 277, The per-tool header
currently renders "Vector Store IDs ({tool.type})" while the top-level section
uses "Knowledge Base IDs"; update the per-tool header string to match the
top-level terminology for consistency. Locate the JSX that references
tool.knowledge_base_ids and change the visible label text in the div containing
"Vector Store IDs ({tool.type})" to "Knowledge Base IDs ({tool.type})" (or vice
versa if you prefer the other term), and ensure any other occurrences of "Vector
Store IDs" in this component are updated the same way.
app/evaluations/[id]/page.tsx (1)

391-401: Consider extracting shared fetch logic.

The code for fetching assistant config and config info (lines 394-401) duplicates logic from fetchJobDetails (lines 105-112). Consider extracting to a shared helper to reduce duplication:

const refreshRelatedData = async (job: EvalJob, apiKey: string) => {
  if (job.assistant_id) {
    fetchAssistantConfig(job.assistant_id, apiKey);
  }
  if (job.config_id && job.config_version) {
    fetchConfigInfo(job.config_id, job.config_version, apiKey);
  }
};

Then call refreshRelatedData(foundJob, selectedKey.key) in both places.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/`[id]/page.tsx around lines 391 - 401, Extract the duplicated
fetch logic into a helper (e.g., refreshRelatedData) that accepts an EvalJob and
apiKey and calls fetchAssistantConfig(job.assistant_id, apiKey) and
fetchConfigInfo(job.config_id, job.config_version, apiKey) when present; then
replace the duplicated blocks in both the post-setJob section (where foundJob is
used) and inside fetchJobDetails with a single call refreshRelatedData(foundJob,
selectedKey.key) to eliminate duplication and keep behavior identical.
app/components/ScoreDisplay.tsx (1)

85-91: Prefer stable key over array index.

Using array index as key (key={idx}) can cause rendering issues if the list is reordered. Since summary.name should be unique within summary scores, use it as the key:

-        {numericScores.map((summary, idx) => {
+        {numericScores.map((summary) => {
           const value = summary.avg !== undefined ? summary.avg.toFixed(2) : 'N/A';
           const std = summary.std !== undefined ? summary.std.toFixed(2) : null;

           return (
             <div
-              key={idx}
+              key={summary.name}
               className="inline-flex items-center gap-1.5 px-4 py-2 rounded-lg text-sm"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/ScoreDisplay.tsx` around lines 85 - 91, The list rendering in
ScoreDisplay.tsx uses numericScores.map with key={idx}, which is unstable;
change the key to use the unique identifier summary.name in the returned element
(i.e., replace key={idx} with key derived from summary.name) and, if
summary.name may be missing, fall back to a stable combined key such as
`${summary.name}-${idx}` to ensure uniqueness while avoiding plain array
indices.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/components/ScoreDisplay.tsx`:
- Around line 39-40: Replace the weak runtime check "'summary_scores' in score"
in ScoreDisplay.tsx with the proper type guards used elsewhere: import and use
isNewScoreObject, isNewScoreObjectV2, and isLegacyScoreObject to validate the
union shapes and distinguish V1 vs V2 formats; specifically, call
isNewScoreObjectV2(score) to detect V2 (traces), isNewScoreObject(score) for V1
(individual_scores), and fallback to isLegacyScoreObject where appropriate,
matching the logic in DetailedResultsTable.tsx so malformed objects with only
summary_scores are rejected and format distinctions are preserved.

---

Nitpick comments:
In `@app/components/ConfigModal.tsx`:
- Around line 266-277: The per-tool header currently renders "Vector Store IDs
({tool.type})" while the top-level section uses "Knowledge Base IDs"; update the
per-tool header string to match the top-level terminology for consistency.
Locate the JSX that references tool.knowledge_base_ids and change the visible
label text in the div containing "Vector Store IDs ({tool.type})" to "Knowledge
Base IDs ({tool.type})" (or vice versa if you prefer the other term), and ensure
any other occurrences of "Vector Store IDs" in this component are updated the
same way.

In `@app/components/ScoreDisplay.tsx`:
- Around line 85-91: The list rendering in ScoreDisplay.tsx uses
numericScores.map with key={idx}, which is unstable; change the key to use the
unique identifier summary.name in the returned element (i.e., replace key={idx}
with key derived from summary.name) and, if summary.name may be missing, fall
back to a stable combined key such as `${summary.name}-${idx}` to ensure
uniqueness while avoiding plain array indices.

In `@app/evaluations/`[id]/page.tsx:
- Around line 391-401: Extract the duplicated fetch logic into a helper (e.g.,
refreshRelatedData) that accepts an EvalJob and apiKey and calls
fetchAssistantConfig(job.assistant_id, apiKey) and
fetchConfigInfo(job.config_id, job.config_version, apiKey) when present; then
replace the duplicated blocks in both the post-setJob section (where foundJob is
used) and inside fetchJobDetails with a single call refreshRelatedData(foundJob,
selectedKey.key) to eliminate duplication and keep behavior identical.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ff683d93-5752-4182-bc00-87baa8b3f8f7

📥 Commits

Reviewing files that changed from the base of the PR and between e6b0b20 and 464887c.

📒 Files selected for processing (7)
  • app/api/evaluations/[id]/route.ts
  • app/components/ConfigModal.tsx
  • app/components/ConfigSelector.tsx
  • app/components/ScoreDisplay.tsx
  • app/components/prompt-editor/ConfigEditorPane.tsx
  • app/evaluations/[id]/page.tsx
  • app/lib/useConfigs.ts

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/components/types.ts`:
- Line 93: The ScoreObject union was widened to include BasicScoreObject but
callers still only check isNewScoreObject()/isNewScoreObjectV2(), so
BasicScoreObject falls through; add and export a single type guard
hasSummaryScores(obj: ScoreObject): obj is NewScoreObject | NewScoreObjectV2 |
BasicScoreObject that checks for the presence/shape of summary_scores, use it in
place of manual isNewScoreObject() || isNewScoreObjectV2() chains (e.g., replace
the isNewFormat logic in the code that calls isNewScoreObject/isNewScoreObjectV2
and the checks in DetailedResultsTable) so all callers uniformly detect objects
that contain summary_scores before widening the union.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 71230f28-ccd9-4db9-a860-2c152ff75384

📥 Commits

Reviewing files that changed from the base of the PR and between 464887c and 067ea27.

📒 Files selected for processing (2)
  • app/components/ScoreDisplay.tsx
  • app/components/types.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/components/ScoreDisplay.tsx

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
app/evaluations/[id]/page.tsx (1)

65-119: ⚠️ Potential issue | 🟡 Minor

Missing toast in useCallback dependencies.

The fetchJobDetails callback uses toast.error (line 91) but toast is not included in the dependency array (line 119). While useToast likely returns a stable reference, this could cause stale closure issues if the toast context changes.

🔧 Suggested fix
-  }, [apiKeys, selectedKeyId, jobId, exportFormat]);
+  }, [apiKeys, selectedKeyId, jobId, exportFormat, toast]);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/`[id]/page.tsx around lines 65 - 119, The fetchJobDetails
useCallback closure references toast.error but toast is not included in its
dependency array; update the dependency array for fetchJobDetails (the
useCallback that defines fetchJobDetails) to include the toast reference
returned by useToast (alongside apiKeys, selectedKeyId, jobId, exportFormat) so
the callback sees the current toast instance; if useToast returns a stable ref,
you can also assert stability, but the immediate fix is to add toast to the
dependency array of fetchJobDetails.
🧹 Nitpick comments (5)
app/components/DetailedResultsTable.tsx (1)

33-33: Redundant type cast after type guard.

The as GroupedTraceItem[] cast is unnecessary because isGroupedFormat() is a type guard (traces is GroupedTraceItem[]) that already narrows the type.

🔧 Remove redundant cast
-      return <GroupedResultsTable traces={scoreObject.traces as GroupedTraceItem[]} />;
+      return <GroupedResultsTable traces={scoreObject.traces} />;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/DetailedResultsTable.tsx` at line 33, The return contains a
redundant cast: remove the unnecessary "as GroupedTraceItem[]" on
scoreObject.traces because the isGroupedFormat() type guard already narrows
traces to GroupedTraceItem[]; update the JSX return in DetailedResultsTable to
simply pass scoreObject.traces to GroupedResultsTable (no cast) and ensure the
isGroupedFormat(check) call remains the guard before this return so TypeScript
still recognizes the narrowed type.
app/components/ScoreDisplay.tsx (1)

85-107: Consider using summary.name as the React key instead of index.

Using array index as a key (key={idx}) works but is less robust if the list could be reordered or filtered in the future. Since summary.name should be unique within summaryScores, it would be a more stable key.

🔧 Suggested change
-        {numericScores.map((summary, idx) => {
+        {numericScores.map((summary) => {
           const value = summary.avg !== undefined ? summary.avg.toFixed(2) : 'N/A';
           const std = summary.std !== undefined ? summary.std.toFixed(2) : null;

           return (
             <div
-              key={idx}
+              key={summary.name}
               className="inline-flex items-center gap-1.5 px-4 py-2 rounded-lg text-sm"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/ScoreDisplay.tsx` around lines 85 - 107, Replace the use of
array index as the React key in the numericScores mapping with a stable unique
identifier by using summary.name instead of idx; update the JSX where
numericScores.map((summary, idx) => { ... key={idx} ... }) to key={summary.name}
(ensuring summary.name is unique/defined) so components are keyed consistently
if the list is reordered or filtered.
app/components/types.ts (1)

140-151: Simplify redundant logic in hasSummaryScores.

The internal if-statement checking for 'traces' in score adds no value since both branches return true. After confirming 'summary_scores' in score, the function can directly return true.

♻️ Simplified implementation
 export function hasSummaryScores(score: ScoreObject | null | undefined): score is NewScoreObjectV2 | BasicScoreObject {
   if (!score) return false;
-  if (!('summary_scores' in score)) return false;
-
-  // Prioritize traces format if available
-  if ('traces' in score) {
-    return true; 
-  }
-
-  // Otherwise, it's BasicScoreObject (summary_scores only, no traces, no individual_scores)
-  return true;
+  return 'summary_scores' in score;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/types.ts` around lines 140 - 151, The function
hasSummaryScores contains a redundant conditional: after confirming the input is
non-null and that 'summary_scores' in score, it checks 'traces' in score but
both branches return true; simplify by removing the 'traces' check and directly
return true once 'summary_scores' exists. Update the implementation in
hasSummaryScores (referencing ScoreObject, NewScoreObjectV2, BasicScoreObject)
to remove the unnecessary branch and return true after the summary_scores guard.
app/evaluations/[id]/page.tsx (2)

362-410: Consider extracting shared fetch logic to reduce duplication.

handleResync duplicates substantial logic from fetchJobDetails (lines 65-119): both fetch evaluation data, set the job state, and conditionally fetch assistant config and config info. This violates DRY and increases maintenance burden.

♻️ Suggested approach

Extract the shared logic into a reusable helper:

const processEvaluationResponse = async (data: any, apiKey: string) => {
  const foundJob = data.data || data;
  if (!foundJob) {
    throw new Error('Evaluation job not found');
  }
  setJob(foundJob);

  if (foundJob.assistant_id) {
    fetchAssistantConfig(foundJob.assistant_id, apiKey);
  }
  if (foundJob.config_id && foundJob.config_version) {
    fetchConfigInfo(foundJob.config_id, foundJob.config_version, apiKey);
  }
  return foundJob;
};

Then both fetchJobDetails and handleResync can call this helper after their respective fetch operations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/`[id]/page.tsx around lines 362 - 410, handleResync
duplicates the same response-processing steps found in fetchJobDetails; extract
that shared logic into a helper (e.g., processEvaluationResponse) that accepts
the parsed response data and apiKey, validates/normalizes foundJob (data.data ||
data), calls setJob(foundJob), and conditionally invokes
fetchAssistantConfig(foundJob.assistant_id, apiKey) and
fetchConfigInfo(foundJob.config_id, foundJob.config_version, apiKey) before
returning the job; then update both fetchJobDetails and handleResync to call
this helper after their fetch/response.json() to remove duplication.

566-575: Knowledge base IDs may contain duplicates.

The current logic aggregates knowledge_base_ids from all tools using flatMap, which could result in duplicate IDs being displayed if multiple tools share the same knowledge base.

🔧 Deduplicate knowledge base IDs
 {configVersionInfo.tools && configVersionInfo.tools.length > 0 && (() => {
-  const knowledgeBaseIds = configVersionInfo.tools
+  const knowledgeBaseIds = [...new Set(configVersionInfo.tools
     .filter(tool => Array.isArray(tool.knowledge_base_ids) && tool.knowledge_base_ids.length > 0)
-    .flatMap(tool => tool.knowledge_base_ids);
+    .flatMap(tool => tool.knowledge_base_ids))];
   return knowledgeBaseIds.length > 0 ? (
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/`[id]/page.tsx around lines 566 - 575, The displayed KB IDs
can contain duplicates because the current IIFE collects all
tool.knowledge_base_ids via flatMap; update that block to deduplicate the
resulting knowledgeBaseIds (e.g., convert to a Set or use Array.from(new
Set(...)) before checking length and joining) so the span for KB: uses unique
IDs only; keep the surrounding conditional (configVersionInfo.tools &&
configVersionInfo.tools.length > 0) and the same rendering logic but use the
deduplicated array when computing knowledgeBaseIds.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/components/types.ts`:
- Around line 185-201: The fallback in the score.traces.map handler returns an
incomplete IndividualScore when a GroupedTraceItem slips through; update the
fallback (the map over score.traces handling TraceItem | GroupedTraceItem) to
return a full IndividualScore shape (populate trace_id, input, output, metadata,
trace_scores) instead of just trace_id/trace_scores and emit an explicit warning
(e.g., console.warn or a logger) including the offending trace so consumers
won’t break; alternatively you may throw an error to fail-fast — pick one
approach and implement it in the map’s fallback branch that currently checks for
'llm_answer' (referencing TraceItem, GroupedTraceItem, IndividualScore, and the
score.traces mapping).

---

Outside diff comments:
In `@app/evaluations/`[id]/page.tsx:
- Around line 65-119: The fetchJobDetails useCallback closure references
toast.error but toast is not included in its dependency array; update the
dependency array for fetchJobDetails (the useCallback that defines
fetchJobDetails) to include the toast reference returned by useToast (alongside
apiKeys, selectedKeyId, jobId, exportFormat) so the callback sees the current
toast instance; if useToast returns a stable ref, you can also assert stability,
but the immediate fix is to add toast to the dependency array of
fetchJobDetails.

---

Nitpick comments:
In `@app/components/DetailedResultsTable.tsx`:
- Line 33: The return contains a redundant cast: remove the unnecessary "as
GroupedTraceItem[]" on scoreObject.traces because the isGroupedFormat() type
guard already narrows traces to GroupedTraceItem[]; update the JSX return in
DetailedResultsTable to simply pass scoreObject.traces to GroupedResultsTable
(no cast) and ensure the isGroupedFormat(check) call remains the guard before
this return so TypeScript still recognizes the narrowed type.

In `@app/components/ScoreDisplay.tsx`:
- Around line 85-107: Replace the use of array index as the React key in the
numericScores mapping with a stable unique identifier by using summary.name
instead of idx; update the JSX where numericScores.map((summary, idx) => { ...
key={idx} ... }) to key={summary.name} (ensuring summary.name is unique/defined)
so components are keyed consistently if the list is reordered or filtered.

In `@app/components/types.ts`:
- Around line 140-151: The function hasSummaryScores contains a redundant
conditional: after confirming the input is non-null and that 'summary_scores' in
score, it checks 'traces' in score but both branches return true; simplify by
removing the 'traces' check and directly return true once 'summary_scores'
exists. Update the implementation in hasSummaryScores (referencing ScoreObject,
NewScoreObjectV2, BasicScoreObject) to remove the unnecessary branch and return
true after the summary_scores guard.

In `@app/evaluations/`[id]/page.tsx:
- Around line 362-410: handleResync duplicates the same response-processing
steps found in fetchJobDetails; extract that shared logic into a helper (e.g.,
processEvaluationResponse) that accepts the parsed response data and apiKey,
validates/normalizes foundJob (data.data || data), calls setJob(foundJob), and
conditionally invokes fetchAssistantConfig(foundJob.assistant_id, apiKey) and
fetchConfigInfo(foundJob.config_id, foundJob.config_version, apiKey) before
returning the job; then update both fetchJobDetails and handleResync to call
this helper after their fetch/response.json() to remove duplication.
- Around line 566-575: The displayed KB IDs can contain duplicates because the
current IIFE collects all tool.knowledge_base_ids via flatMap; update that block
to deduplicate the resulting knowledgeBaseIds (e.g., convert to a Set or use
Array.from(new Set(...)) before checking length and joining) so the span for KB:
uses unique IDs only; keep the surrounding conditional (configVersionInfo.tools
&& configVersionInfo.tools.length > 0) and the same rendering logic but use the
deduplicated array when computing knowledgeBaseIds.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6fd2ef5e-5d70-46a5-80b8-fbd6818fc119

📥 Commits

Reviewing files that changed from the base of the PR and between 067ea27 and 99a4610.

📒 Files selected for processing (4)
  • app/components/DetailedResultsTable.tsx
  • app/components/ScoreDisplay.tsx
  • app/components/types.ts
  • app/evaluations/[id]/page.tsx

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
app/components/ConfigModal.tsx (2)

268-269: Inconsistent label color styling.

These labels use color: '#000000' (pure black) while other section labels in the same modal use color: '#737373' (muted gray). This creates visual inconsistency.

🎨 Suggested fix for consistent styling
-                      <div className="text-xs uppercase font-semibold mb-2" style={{ color: '#000000' }}>
+                      <div className="text-xs uppercase font-semibold mb-2" style={{ color: '#737373' }}>

Apply this to all affected lines (268, 282, 322, 336, 355).

Also applies to: 282-283, 322-323, 336-337, 355-355

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/ConfigModal.tsx` around lines 268 - 269, Update the
inconsistent label color in the ConfigModal component: replace the hard-coded
color '#000000' used in the section label elements (the div rendering "Knowledge
Base IDs ({tool.type})" and the other affected label divs) with the muted gray
'#737373' so labels match the rest of the modal; locate these occurrences in the
ConfigModal.tsx JSX (the divs rendering section labels around the Knowledge
Base, Tool, and other sections) and change their inline style color to '#737373'
for all mentioned lines (268, 282, 322, 336, 355).

206-217: Potential duplicate display of knowledge base IDs.

The top-level "Knowledge Base IDs" section (lines 206-217) displays the deduplicated IDs aggregated from all sources, while the per-tool sections (lines 264-295) display each tool's knowledge_base_ids individually. This means the same IDs could appear twice in the UI.

If this redundancy is intentional for clarity (showing both summary and detail views), consider adding a visual distinction or labeling to clarify the relationship. Otherwise, you may want to show only one or the other.

Also applies to: 264-295

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/ConfigModal.tsx` around lines 206 - 217, The UI currently
shows aggregated knowledge_base_ids at the top and then each tool's
knowledge_base_ids again, causing duplicate IDs; in ConfigModal update the
per-tool rendering (the code that maps over tool.knowledge_base_ids) to filter
out any IDs already present in configVersionInfo?.knowledge_base_ids (e.g.,
tool.knowledge_base_ids.filter(id =>
!configVersionInfo?.knowledge_base_ids.includes(id))) and hide the per-tool
block when the filtered list is empty, or alternatively remove the top-level
aggregated block—use the symbols configVersionInfo, knowledge_base_ids,
tool.knowledge_base_ids and the ConfigModal render logic to locate and apply
this change.
app/components/prompt-editor/DiffView.tsx (1)

215-219: Label update and styling look good, but only the first knowledge base ID is displayed.

The terminology update from "Vector Store" to "Knowledge Base" aligns with the PR objectives. However, line 218 only displays tool.knowledge_base_ids[0], which will silently ignore any additional IDs if a tool has multiple knowledge bases configured.

Consider displaying all IDs if the data model supports multiple:

🔧 Suggested improvement to show all IDs
                              {tool.knowledge_base_ids && (
                                <div className="text-xs mt-1" style={{ color: colors.text.secondary }}>
-                                  Knowledge Base: {tool.knowledge_base_ids[0]}
+                                  Knowledge Base: {tool.knowledge_base_ids.join(', ')}
                                </div>
                              )}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/prompt-editor/DiffView.tsx` around lines 215 - 219, The UI
currently only renders the first knowledge base id via
tool.knowledge_base_ids[0], which ignores additional IDs; update the DiffView
component where tool.knowledge_base_ids is rendered to iterate over the array
(e.g., map or join) and display all IDs (either as a comma-separated string or a
small list) while preserving the existing styling container (the div that
currently shows "Knowledge Base: ..."); ensure you guard for empty/null
tool.knowledge_base_ids and keep the label "Knowledge Base" (or pluralize when
more than one) so multiple IDs are visible.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@app/components/ConfigModal.tsx`:
- Around line 268-269: Update the inconsistent label color in the ConfigModal
component: replace the hard-coded color '#000000' used in the section label
elements (the div rendering "Knowledge Base IDs ({tool.type})" and the other
affected label divs) with the muted gray '#737373' so labels match the rest of
the modal; locate these occurrences in the ConfigModal.tsx JSX (the divs
rendering section labels around the Knowledge Base, Tool, and other sections)
and change their inline style color to '#737373' for all mentioned lines (268,
282, 322, 336, 355).
- Around line 206-217: The UI currently shows aggregated knowledge_base_ids at
the top and then each tool's knowledge_base_ids again, causing duplicate IDs; in
ConfigModal update the per-tool rendering (the code that maps over
tool.knowledge_base_ids) to filter out any IDs already present in
configVersionInfo?.knowledge_base_ids (e.g., tool.knowledge_base_ids.filter(id
=> !configVersionInfo?.knowledge_base_ids.includes(id))) and hide the per-tool
block when the filtered list is empty, or alternatively remove the top-level
aggregated block—use the symbols configVersionInfo, knowledge_base_ids,
tool.knowledge_base_ids and the ConfigModal render logic to locate and apply
this change.

In `@app/components/prompt-editor/DiffView.tsx`:
- Around line 215-219: The UI currently only renders the first knowledge base id
via tool.knowledge_base_ids[0], which ignores additional IDs; update the
DiffView component where tool.knowledge_base_ids is rendered to iterate over the
array (e.g., map or join) and display all IDs (either as a comma-separated
string or a small list) while preserving the existing styling container (the div
that currently shows "Knowledge Base: ..."); ensure you guard for empty/null
tool.knowledge_base_ids and keep the label "Knowledge Base" (or pluralize when
more than one) so multiple IDs are visible.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6299788a-907d-485d-909c-caaab70a98e3

📥 Commits

Reviewing files that changed from the base of the PR and between 99a4610 and 9cc6b14.

📒 Files selected for processing (6)
  • app/components/ConfigCard.tsx
  • app/components/ConfigDrawer.tsx
  • app/components/ConfigModal.tsx
  • app/components/prompt-editor/ConfigEditorPane.tsx
  • app/components/prompt-editor/CurrentConfigTab.tsx
  • app/components/prompt-editor/DiffView.tsx
✅ Files skipped from review due to trivial changes (1)
  • app/components/ConfigCard.tsx
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/components/prompt-editor/ConfigEditorPane.tsx

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
app/evaluations/[id]/page.tsx (1)

120-120: ⚠️ Potential issue | 🔴 Critical

Add toast to the useCallback dependency array.

The toast object returned by useToast() is unstable—it's created as a new object on each render in ToastProvider (line 62: value={{ ... }}). Even though the individual methods are wrapped in useCallback, the containing object itself is recreated on each render. This causes fetchJobDetails to be unnecessarily recreated and defeats the purpose of useCallback. Either add toast to the dependencies at line 120, or memoize the context value in ToastProvider.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/`[id]/page.tsx at line 120, The fetchJobDetails useCallback
is missing the unstable toast object from useToast() in its dependency array;
update the dependency array for the useCallback that defines fetchJobDetails to
include toast (or alternatively memoize the ToastProvider context value), so add
toast to the array alongside apiKeys, selectedKeyId, jobId, exportFormat;
reference useToast(), fetchJobDetails, and ToastProvider when making the change.
🧹 Nitpick comments (2)
app/evaluations/[id]/page.tsx (2)

832-869: Consider adding accessibility attributes to the modal.

The modal works functionally but lacks some accessibility features. At minimum, consider adding role="dialog" and aria-modal="true" for screen reader users:

♻️ Proposed improvement
       {showNoTracesModal && (
         <div
           className="fixed inset-0 z-50 flex items-center justify-center"
           style={{ backgroundColor: 'rgba(0, 0, 0, 0.5)' }}
           onClick={() => setShowNoTracesModal(false)}
+          role="dialog"
+          aria-modal="true"
+          aria-labelledby="no-traces-modal-title"
         >
           <div
             className="rounded-lg shadow-lg p-6 max-w-md mx-4"
             style={{ backgroundColor: colors.bg.primary }}
             onClick={(e) => e.stopPropagation()}
           >
             <div className="mb-4">
-              <h3 className="text-lg font-semibold mb-2" style={{ color: colors.text.primary }}>
+              <h3 id="no-traces-modal-title" className="text-lg font-semibold mb-2" style={{ color: colors.text.primary }}>
                 No Langfuse Traces Available
               </h3>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/`[id]/page.tsx around lines 832 - 869, The modal rendered
when showNoTracesModal is true should include accessibility attributes: add
role="dialog" and aria-modal="true" to the outer modal container (the div that
currently has className="fixed inset-0..."), and add aria-labelledby pointing to
the h3 text (or give the h3 an id) and aria-describedby pointing to the
paragraph (or give it an id) so screen readers can announce the title and
description; also ensure the OK button (which uses setShowNoTracesModal) is
focusable on open and consider moving keyboard handling (Escape to close) to the
dialog container to improve keyboard accessibility.

577-586: Consider deduplicating knowledge base IDs.

If multiple tools reference the same knowledge base, the ID will appear multiple times in the display. A simple Set can deduplicate:

♻️ Proposed improvement
                          {configVersionInfo.tools && configVersionInfo.tools.length > 0 && (() => {
-                            const knowledgeBaseIds = configVersionInfo.tools
+                            const knowledgeBaseIds = [...new Set(configVersionInfo.tools
                               .filter(tool => Array.isArray(tool.knowledge_base_ids) && tool.knowledge_base_ids.length > 0)
-                              .flatMap(tool => tool.knowledge_base_ids);
+                              .flatMap(tool => tool.knowledge_base_ids))];
                             return knowledgeBaseIds.length > 0 ? (
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/evaluations/`[id]/page.tsx around lines 577 - 586, The displayed KB list
can contain duplicates because knowledgeBaseIds is built by flatMapping
tool.knowledge_base_ids; deduplicate before rendering by converting the array to
a Set (or using Array.from(new Set(...))) so only unique IDs are joined and
shown; update the logic around configVersionInfo.tools -> flatMap(...) ->
knowledgeBaseIds to produce a uniqueKnowledgeBaseIds (or reuse knowledgeBaseIds
but deduplicated) and then use uniqueKnowledgeBaseIds.join(', ') when rendering.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/evaluations/`[id]/page.tsx:
- Around line 397-400: The early-return branch that checks hasTraces sets
setShowNoTracesModal(true) but never resets the resync state, leaving the Resync
button stuck; update the handler (the function containing the hasTraces check)
to call setIsResyncing(false) before returning so isResyncing is cleared when no
traces are found (i.e., add setIsResyncing(false) immediately before the
existing setShowNoTracesModal(true) return path).

---

Outside diff comments:
In `@app/evaluations/`[id]/page.tsx:
- Line 120: The fetchJobDetails useCallback is missing the unstable toast object
from useToast() in its dependency array; update the dependency array for the
useCallback that defines fetchJobDetails to include toast (or alternatively
memoize the ToastProvider context value), so add toast to the array alongside
apiKeys, selectedKeyId, jobId, exportFormat; reference useToast(),
fetchJobDetails, and ToastProvider when making the change.

---

Nitpick comments:
In `@app/evaluations/`[id]/page.tsx:
- Around line 832-869: The modal rendered when showNoTracesModal is true should
include accessibility attributes: add role="dialog" and aria-modal="true" to the
outer modal container (the div that currently has className="fixed inset-0..."),
and add aria-labelledby pointing to the h3 text (or give the h3 an id) and
aria-describedby pointing to the paragraph (or give it an id) so screen readers
can announce the title and description; also ensure the OK button (which uses
setShowNoTracesModal) is focusable on open and consider moving keyboard handling
(Escape to close) to the dialog container to improve keyboard accessibility.
- Around line 577-586: The displayed KB list can contain duplicates because
knowledgeBaseIds is built by flatMapping tool.knowledge_base_ids; deduplicate
before rendering by converting the array to a Set (or using Array.from(new
Set(...))) so only unique IDs are joined and shown; update the logic around
configVersionInfo.tools -> flatMap(...) -> knowledgeBaseIds to produce a
uniqueKnowledgeBaseIds (or reuse knowledgeBaseIds but deduplicated) and then use
uniqueKnowledgeBaseIds.join(', ') when rendering.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7c2384f3-2f3c-4d80-ac84-9e7c279c5839

📥 Commits

Reviewing files that changed from the base of the PR and between 9cc6b14 and 769588a.

📒 Files selected for processing (2)
  • app/components/prompt-editor/ConfigEditorPane.tsx
  • app/evaluations/[id]/page.tsx
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/components/prompt-editor/ConfigEditorPane.tsx

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants