leanprover/skills-testing

Skills Testing

Test infrastructure and results for leanprover/skills.

This repo is designed to be cloned as a subdirectory of the skills repo:

cd /path/to/skills
git clone https://github.com/leanprover/skills-testing.git

Running tests

All scripts should be run from the skills repo root:

# Run all tests for one skill (once with the skill, once without, then compare)
skills-testing/scripts/run-skill-tests lean-proof

# Run all tests for all skills
skills-testing/scripts/run-all-tests

# Judge completed runs
skills-testing/scripts/judge-all

# View results
skills-testing/scripts/summary --latest

Results structure

results/<skill>/<test>/<timestamp>/
  with-skill.json       # claude-wrapper output (result, tools, skills, usage)
  without-skill.json    # claude-wrapper output without plugin
  with-skill.jsonl      # full conversation log (JSONL)
  without-skill.jsonl   # full conversation log (JSONL)
  judge.json            # verdict: satisfactory / not_needed / needs_improvement
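
The layout above makes it easy to find runs that still need judging. The following is only a sketch, not part of the repo's scripts: `latest_runs` is a hypothetical helper, and it assumes timestamp directory names sort lexicographically in time order and that the results root is passed in explicitly.

```shell
# Hypothetical helper (not one of the repo's scripts): for every
# <skill>/<test> pair under a results root, print the most recent
# timestamp directory and whether it already contains judge.json.
# Assumption: timestamp names sort lexicographically in time order.
latest_runs() {
  local root="${1:-results}"
  local test_dir latest
  for test_dir in "$root"/*/*/; do
    # Skip the literal pattern when the glob matches nothing.
    [ -d "$test_dir" ] || continue
    # The lexicographically last entry is taken as the latest run.
    latest=$(ls -1 "$test_dir" | sort | tail -n 1)
    [ -n "$latest" ] || continue
    if [ -f "$test_dir$latest/judge.json" ]; then
      echo "judged   $test_dir$latest"
    else
      echo "pending  $test_dir$latest"
    fi
  done
}
```

For example, `latest_runs skills-testing/results` from the skills repo root would list one line per test, assuming the results directory lives inside the cloned `skills-testing` folder.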

Dependencies

  • claude CLI
  • jq
  • yq
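
A quick way to confirm these tools are on your `PATH` before running the scripts is sketched below. `missing_deps` is a hypothetical helper, not something the repo provides, and it only checks that each command exists, not which version is installed.

```shell
# Hypothetical helper: print the name of every listed tool
# that cannot be found on PATH. Prints nothing if all are present.
missing_deps() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

Running `missing_deps claude jq yq` prints one line per missing tool, so empty output means all three dependencies were found.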
