Classification: Implement Model Evaluation Endpoint #303

@nishika26

Description
Build a model evaluation endpoint to assess the performance of fine-tuned models, compare results, and retain only the best-performing model.

Tasks

  • Implement logic to split data and convert the test set into JSONL format.

  • Generate predictions using one or more fine-tuned models on the test data.

  • Implement evaluation logic using the Matthews Correlation Coefficient (MCC) metric.

  • Identify the best-performing model based on evaluation results.

  • Automatically delete lower-performing models to maintain system efficiency.

  • Write test cases to validate the full evaluation and cleanup workflow.
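
The tasks above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the record fields (`text`, `label`), the binary label encoding, the helper names (`split_rows`, `to_jsonl`, `mcc`, `pick_best`), and the model identifiers are all hypothetical. Prediction and deletion calls are omitted because they depend on the fine-tuning provider's API, which the issue does not specify.

```python
import json
import math
import random


def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and return (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def to_jsonl(rows):
    """Serialize rows as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(row) for row in rows)


def mcc(y_true, y_pred, positive=1):
    """Matthews Correlation Coefficient for binary labels (assumed here)."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def pick_best(scores):
    """Return (best_model_id, ids_to_delete) from a {model_id: mcc} mapping."""
    best = max(scores, key=scores.get)
    to_delete = sorted(m for m in scores if m != best)
    return best, to_delete


# Hypothetical data: ten labeled rows with a 20% test split.
rows = [{"text": f"sample {i}", "label": i % 2} for i in range(10)]
train, test = split_rows(rows)
test_jsonl = to_jsonl(test)

# Hypothetical scores for three fine-tuned models.
best, to_delete = pick_best({"model-a": 0.41, "model-b": 0.78, "model-c": 0.10})
```

Returning the deletion candidates instead of deleting inside `pick_best` keeps the selection step free of side effects, so the cleanup can be tested separately from the scoring.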

Metadata

Assignees

Labels

No labels

Projects

Status

Closed

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
