feat(ci): G3 perf regression gate + allowlist governance docs #293
- G3 gate now compares criterion bencher output against git-tracked perf-baseline.json, failing on >15% regression (configurable).
- New perf-baseline-update.yml workflow auto-generates baseline update PRs on main pushes that touch Rust sources.
- Added check_perf_regression.cjs (gate) and generate_perf_baseline.cjs (baseline generation from bencher output).
- Added "Determinism Allowlist Governance" section to RELEASE_POLICY.md documenting exemption criteria, approval requirements, and audit cadence.
- Added cross-reference from ban-nondeterminism.sh to governance policy.

Closes #280
Closes #287
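As a rough illustration of the comparison described above, the sketch below parses bencher-style lines and applies a percentage threshold. It is not the contents of check_perf_regression.cjs: the function names (`parseBencherOutput`, `regressionPercent`, `isRegression`) are hypothetical, and it assumes each benchmark line has the form `test <name> ... bench: <ns> ns/iter (+/- <dev>)`.

```javascript
// Hypothetical sketch, not the actual check_perf_regression.cjs.
// Assumes bencher-style lines: "test <name> ... bench: <ns> ns/iter (+/- <dev>)".
const BENCH_LINE = /^test\s+(\S+)\s+\.\.\.\s+bench:\s+([\d,]+)\s+ns\/iter/;

// Returns a flat map of benchmark name -> nanoseconds per iteration.
// Lines that do not match the assumed format are ignored.
function parseBencherOutput(text) {
  const results = {};
  for (const line of text.split(/\r?\n/)) {
    const match = BENCH_LINE.exec(line.trim());
    if (match) {
      // Tolerate thousands separators such as "1,200".
      results[match[1]] = Number(match[2].replace(/,/g, ''));
    }
  }
  return results;
}

// Percent change relative to the baseline; positive means slower.
function regressionPercent(baselineNs, currentNs) {
  return ((currentNs - baselineNs) / baselineNs) * 100;
}

// True when the slowdown strictly exceeds the threshold (15% by default).
function isRegression(baselineNs, currentNs, thresholdPercent = 15) {
  return regressionPercent(baselineNs, currentNs) > thresholdPercent;
}
```

With these definitions, a benchmark that goes from 1000 ns/iter to 1200 ns/iter (a 20% slowdown) would be flagged, while 1000 to 1100 (10%) would not.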
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b94868b54f
    const report = [];
    let regressions = 0;

    for (const name of benchNames) {
Fail when baseline benchmarks disappear from current run
Iterating only Object.keys(current) means the gate never checks benchmarks that exist in perf-baseline.json but are missing from the new output, so a renamed/removed benchmark can silently bypass regression enforcement and still report G3: PASSED. This undermines the accuracy of the regression gate in exactly the cases where benchmark coverage changes, so the comparison should include baseline-only entries (at least as a hard failure or explicit review-required state).
- Add step id to "Generate baseline JSON" and wire skip output into "Create baseline PR" if-condition (was dead GITHUB_OUTPUT write).
- Add date prefix to baseline branch name to avoid short-SHA collisions.
- Remove redundant git-diff guard (now handled by step skip logic).
- Warn when baseline benchmarks are missing from current criterion run.
- Remove unused path import from check_perf_regression.cjs.
… run

Promote missing-benchmark detection from WARN to FAIL. A benchmark that exists in perf-baseline.json but is absent from the current criterion output now counts as a regression, preventing silent bypass via benchmark renames or removals.

Resolution: update perf-baseline.json to remove the stale entry.

Addresses chatgpt-codex-connector review comment on PR #293.
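A minimal sketch of the behavior this commit describes follows, assuming both the baseline and the current results are flat maps from benchmark name to nanoseconds per iteration. The real perf-baseline.json schema is not shown in this thread, so that shape is an assumption, and `compareBenchmarks` and the status labels are hypothetical names. Iterating over the union of both key sets means baseline-only entries are reported rather than skipped.

```javascript
// Hypothetical sketch, not the actual gate script.
// Assumes baseline and current are flat maps: { "<bench name>": <ns per iter> }.
function compareBenchmarks(baseline, current, thresholdPercent = 15) {
  const report = [];
  let regressions = 0;

  // Walk the union of names so baseline-only benchmarks are not silently skipped.
  const benchNames = new Set([...Object.keys(baseline), ...Object.keys(current)]);

  for (const name of [...benchNames].sort()) {
    const hasBaseline = Object.prototype.hasOwnProperty.call(baseline, name);
    const hasCurrent = Object.prototype.hasOwnProperty.call(current, name);

    if (hasBaseline && !hasCurrent) {
      // Renamed or removed benchmark: counted as a failure, not a warning.
      report.push({ name, status: 'MISSING' });
      regressions += 1;
    } else if (!hasBaseline) {
      // New benchmark with nothing to compare against yet.
      report.push({ name, status: 'NEW' });
    } else {
      const changePercent = ((current[name] - baseline[name]) / baseline[name]) * 100;
      const failed = changePercent > thresholdPercent;
      if (failed) regressions += 1;
      report.push({ name, status: failed ? 'REGRESSION' : 'OK', changePercent });
    }
  }

  return { report, regressions, passed: regressions === 0 };
}
```

Under this sketch an empty baseline yields no failures, since every benchmark is classified as new. A caller could map `passed` to the process exit code.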
Fixed in 6c039da: Missing benchmarks now fail G3 gate instead of warn-only. A benchmark present in
@coderabbitai review

✅ Actions performed: review triggered.
Summary
- G3 perf regression gate: compares criterion bencher output against perf-baseline.json with a configurable regression threshold (default 15%). Structured perf-report.json artifact uploaded alongside raw perf.log. A separate perf-baseline-update.yml workflow auto-generates baseline update PRs on main pushes that touch Rust sources.
- Determinism Allowlist Governance: new section in docs/RELEASE_POLICY.md documenting when exemptions are acceptable, approval requirements, and audit cadence. Cross-referenced from ban-nondeterminism.sh header.

New files
- scripts/check_perf_regression.cjs: the G3 gate.
- scripts/generate_perf_baseline.cjs: generates perf-baseline.json from criterion bencher output.
- .github/workflows/perf-baseline-update.yml: auto-generates PRs that update perf-baseline.json.

Test plan
- check_perf_regression.cjs exits 0 with empty baseline (first-run behavior)

Closes #280
Closes #287