Test infrastructure and results for leanprover/skills.
This repo is designed to be cloned as a subdirectory of the skills repo:
cd /path/to/skills
git clone https://github.com/leanprover/skills-testing.git

All scripts should be run from the skills repo root:
# Run all tests for a skill (with and without skill, then compare)
skills-testing/scripts/run-skill-tests lean-proof
# Run all tests for all skills
skills-testing/scripts/run-all-tests
# Judge completed runs
skills-testing/scripts/judge-all
# View results
skills-testing/scripts/summary --latest

results/<skill>/<test>/<timestamp>/
  with-skill.json       # claude-wrapper output (result, tools, skills, usage)
  without-skill.json    # claude-wrapper output without plugin
  with-skill.jsonl      # full conversation log (JSONL)
  without-skill.jsonl   # full conversation log (JSONL)
  judge.json            # verdict: satisfactory / not_needed / needs_improvement
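Given the results/<skill>/<test>/<timestamp>/ layout, the most recent run for a test can be located with a few lines of shell. This is a minimal sketch and not part of the repo's scripts: `latest_run` is a hypothetical helper, and it assumes that timestamp directory names sort lexicographically in chronological order, which the layout alone does not guarantee.

```shell
# Hypothetical helper (not part of skills-testing): print the newest run
# directory for a given skill and test.
# Assumption: <timestamp> names sort lexicographically in chronological order.
latest_run() {
  results_root=$1
  skill=$2
  test_name=$3
  latest=""
  for dir in "$results_root/$skill/$test_name"/*/; do
    [ -d "$dir" ] || continue   # the glob matched nothing
    latest=${dir%/}             # globs expand in sorted order, so the last one wins
  done
  [ -n "$latest" ] || return 1  # no runs found for this skill/test
  printf '%s\n' "$latest"
}
```

For example, `latest_run results lean-proof some-test` would print the path of the newest run directory, where `some-test` stands in for a real test name.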
Requirements: claude CLI, jq, yq
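A quick pre-flight check can confirm these tools are on the PATH before running any tests. The sketch below is illustrative only: `missing_tools` is a hypothetical function, not something the repo provides, and the executable names `claude`, `jq`, and `yq` are assumed to match the tools listed.

```shell
# Hypothetical pre-flight check (not part of skills-testing): print each
# named command that is not found on PATH; return non-zero if any is missing.
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
      status=1
    fi
  done
  return "$status"
}

# Usage (assumed executable names):
#   missing_tools claude jq yq || echo "install the tools listed above" >&2
```

Printing the missing names rather than stopping at the first one lets a single run report everything that still needs to be installed.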