📝 Walkthrough

Splits the evaluations API router into separate …

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Client/API Request
    participant API as API Router
    participant Svc as Evaluation Service
    participant CRUD as CRUD Layer
    participant DB as Database
    participant Langfuse as External API (Langfuse)
    Client->>API: GET /evaluations/:id[?get_trace_info]
    API->>Svc: get_evaluation_with_scores(params)
    Svc->>CRUD: get_evaluation_run(id)
    CRUD->>DB: Query eval_run
    DB-->>CRUD: eval_run (may include cached summary_scores/traces)
    CRUD-->>Svc: eval_run
    alt Need fetch or resync
        Svc->>Langfuse: Fetch traces & summary_scores
        Langfuse-->>Svc: langfuse_score (summary_scores + traces)
        Svc->>Svc: Extract existing_summary_scores from eval_run.score
        Svc->>Svc: Merge existing_summary_scores + langfuse_summary_scores (langfuse precedence)
        Svc->>Svc: Build final score {merged_summary_scores, traces}
        Svc->>CRUD: save_score(eval_run.id, final_score)
        CRUD->>DB: Update eval_run.score
        DB-->>CRUD: Confirm
        CRUD-->>Svc: persisted
    else Cached data sufficient
        Svc->>Svc: Return cached eval_run.score
    end
    Svc-->>Client: Evaluation with merged scores (+ traces if requested)
```
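The merge step in the diagram (Langfuse values taking precedence over cached ones) can be sketched as a plain dict merge. `merge_summary_scores`, `cached`, and `langfuse` below are hypothetical stand-ins for the service's merge of `eval_run.score["summary_scores"]` with the freshly fetched Langfuse scores, not the repository's actual code:

```python
def merge_summary_scores(existing: dict, fetched: dict) -> dict:
    """Merge cached and freshly fetched summary scores.

    Keys present in both dicts take the freshly fetched (Langfuse)
    value, mirroring the 'langfuse precedence' rule in the diagram.
    """
    return {**existing, **fetched}


# Hypothetical data: a cached score and a newer Langfuse fetch.
cached = {"accuracy": 0.81, "latency_ms": 420}
langfuse = {"accuracy": 0.85, "toxicity": 0.02}

merged = merge_summary_scores(cached, langfuse)
print(merged)  # {'accuracy': 0.85, 'latency_ms': 420, 'toxicity': 0.02}
```

Cached keys absent from the Langfuse response survive the merge, so a resync never drops previously computed scores.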
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs
🚥 Pre-merge checks | ✅ Passed checks (3 passed)
Codecov Report: ❌ Patch coverage is …
backend/app/crud/evaluations/core.py
Should we add strict type safety here?
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/crud/evaluations/core.py (1)
227-263: Avoid returning legacy scores without `summary_scores`.

If `eval_run.score` has `traces` but lacks `summary_scores` (legacy rows), the early return skips the Langfuse merge and returns an incomplete `EvaluationScore` payload. Treat missing `summary_scores` as a cache miss so callers always receive the new schema.

✅ Suggested fix

```diff
-    has_traces = eval_run.score is not None and "traces" in eval_run.score
-    if not force_refetch and has_traces:
+    has_traces = eval_run.score is not None and "traces" in eval_run.score
+    has_summary_scores = (
+        eval_run.score is not None and "summary_scores" in eval_run.score
+    )
+    if not force_refetch and has_traces and has_summary_scores:
         logger.info(
             f"[get_or_fetch_score] Returning existing score | evaluation_id={eval_run.id}"
         )
         return eval_run.score
```
Summary
Target issue is #520
Checklist
Before submitting a pull request, please ensure that you mark these tasks.

Run `fastapi run --reload app/main.py` or `docker compose up` in the repository root and test.

Notes
Summary by CodeRabbit
New Features
Improvements
Bug Fixes
Tests