MadGAA Lab
Popular repositories
- OSCE-Project Public
Generative Adversarial Agents System for evaluating Objective Structured Clinical Examination capabilities
- OSCE-AgentBeats-Leaderboard Public
The leaderboard for Objective Structured Clinical Examination Evaluator (OSCE-Project)
Python 1
- OSCE-AgentBeats-Leaderboard-Baseline Public Forked from MadGAA-Lab/OSCE-AgentBeats-Leaderboard
The baseline submission to the leaderboard for Objective Structured Clinical Examination Evaluator (OSCE-Project)
Python
- OSCE-Project-Stage2 Public Forked from MadGAA-Lab/OSCE-Project
Generative Adversarial Agents System for evaluating Objective Structured Clinical Examination capabilities (Doctor Agent)
Python
- FhirAgentEvaluator Public Forked from abasit/FhirAgentEvaluator
An evaluation for medical agents interacting with FHIR databases for clinical tasks. Based upon MedAgentBench and FHIRAgentBench.
Python
Repositories
- Entropic-CRMArena-leaderboard-submission Public template Forked from rkstu/Entropic-CRMArena-leaderboard
A2A-compliant CRM benchmark with adversarial robustness testing (Schema Drift + Context Rot) and 7-Dimension scoring
- CRM-Agent-Phase2_dev Public
A CRM agent for the Berkeley RDI AgentX–AgentBeats Phase 2 competition. Evaluated by the Entropic CRMArena green agent across 2,140 CRM tasks (22 categories) with schema drift and context rot resistance.
- MadGAA-Lab-Website Public
- OSCE-Project-Stage2 Public Forked from MadGAA-Lab/OSCE-Project
Generative Adversarial Agents System for evaluating Objective Structured Clinical Examination capabilities (Doctor Agent)
- OSCE-Project Public
Generative Adversarial Agents System for evaluating Objective Structured Clinical Examination capabilities
- FHIR-Agent Public
An LLM agent framework that deals with FHIR (Fast Healthcare Interoperability Resources) data.
- FhirAgentEvaluator Public Forked from abasit/FhirAgentEvaluator
An evaluation for medical agents interacting with FHIR databases for clinical tasks. Based upon MedAgentBench and FHIRAgentBench.
- OSCE-AgentBeats-Leaderboard Public
The leaderboard for Objective Structured Clinical Examination Evaluator (OSCE-Project)
- OSCE-AgentBeats-Leaderboard-Baseline Public Forked from MadGAA-Lab/OSCE-AgentBeats-Leaderboard
The baseline submission to the leaderboard for Objective Structured Clinical Examination Evaluator (OSCE-Project)