Add PR code-size benchmark comment workflow #267
📐 Generated code size & instructions
Code size (compiled, -Os)
| Target · Compiler | Code size | Instructions | many_conditionals Ok() |
|---|---|---|---|
| x86-64 · gcc | 12762 B | 2373 | 2411 B |
| x86-64 · clang | 6259 B | 918 | 1460 B |
| ARM Cortex-M4 · gcc | 7112 B | 1817 | 1964 B |
| MicroBlaze · gcc | 11636 B | 2640 | 14104 B |
-O2 / -O0
-O2
| Target · Compiler | Code size | Instructions | many_conditionals Ok() |
|---|---|---|---|
| x86-64 · gcc | 5110 B | 881 | 6861 B |
| x86-64 · clang | 3143 B | 499 | 4608 B |
| ARM Cortex-M4 · gcc | 3329 B | 871 | 3524 B |
| MicroBlaze · gcc | 6492 B | 1354 | 10280 B |
-O0
| Target · Compiler | Code size | Instructions | many_conditionals Ok() |
|---|---|---|---|
| x86-64 · gcc | 99755 B | 19650 | 6471 B |
| x86-64 · clang | 76151 B | 13924 | 7304 B |
| ARM Cortex-M4 · gcc | 59170 B | 19317 | 6816 B |
| MicroBlaze · gcc | 107134 B | 23920 | 8316 B |
Compilers: x86-64 gcc 13.3.0 · x86-64 clang 18.1.3 · ARM Cortex-M4 gcc 13.2.1 · MicroBlaze gcc 14.3.0. benchmark.emb is a fixed fixture (Ok()+CopyFrom over every top-level view); it is pulled forward from head, so only the code generator under test varies between base and head.
13127bc to bcfb21f
On each pull request that can affect generated code, compile a fixed all-features benchmark schema and the many_conditionals Ok() highlight across x86-64 (gcc + clang), ARM Cortex-M4 (gcc) and MicroBlaze (gcc) at -Os/-O2/-O0, and post a sticky comment with the .text size and objdump instruction count, compared against the merge-base.

The benchmark schema is held fixed and pulled forward from the PR head, so only the code generator under test varies between base and head; adding or editing other test .emb files cannot move the numbers. clang has no MicroBlaze back end, so that target is gcc-only.

Adds testdata/benchmark.emb (fixed fixture), scripts/size_bench.py (matrix compile + size/objdump measurement to JSON), scripts/size_comment.py (renders the sticky comment), and .github/workflows/code-size.yml (the pull_request workflow).
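To make the two reported metrics concrete, here is a minimal sketch of how a .text size and an instruction count could be derived from toolchain output. It is not the code in scripts/size_bench.py; the function names are hypothetical, and it assumes the input is the text printed by `size -A` (SysV format) and `objdump -d` from GNU binutils.

```python
import re

# One disassembled instruction in `objdump -d` output:
# address, tab, raw bytes, tab, mnemonic.
# Continuation lines (overflow bytes of a long x86 instruction) carry no
# mnemonic and are deliberately not counted.
_INSN = re.compile(r"^\s*[0-9a-f]+:\t[0-9a-f ]+\t\S")


def text_size(size_output: str) -> int:
    """Sum the sizes of every section whose name starts with .text.

    Assumption: counting .text.* subsections together with .text is the
    desired behavior; the real script may measure this differently.
    """
    total = 0
    for line in size_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith(".text") and fields[1].isdigit():
            total += int(fields[1])
    return total


def instruction_count(disassembly: str) -> int:
    """Count the lines of `objdump -d` output that hold an instruction."""
    return sum(1 for line in disassembly.splitlines() if _INSN.match(line))
```

Counting instructions from a disassembly is a static proxy: it says nothing about how often each instruction runs, which is why it is only a rough signal next to the byte size.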
bcfb21f to 4b73fbc
This PR was based on ideas from @robrussell. I thought the cleanest way to implement this kind of benchmark was to have an action runner post a comment on each PR, both as an FYI and to tip off reviewers to any major regressions. Performance is reported as an instruction count, which is probably meaningless for x86-64 but gives a fuzzy idea of what's going on (a real benchmark isn't feasible because GitHub's action runners are too noisy for that). Open to improvements to the comment format if any are obvious.
Third PR in the embedded code-size chain (stacked after #252 and #265). Adds a GitHub Actions workflow that comments on each PR with the size of the generated Ok() code, so generated-code-size regressions are visible during review.

What it does

On every pull request that can affect generated code, .github/workflows/code-size.yml:

- […] gcc-arm-none-eabi, MicroBlaze via the matching Bootlin toolchain, and uses the host x86-64 g++.
- […] scripts/profile_tool.py (from #265, "Add embedded code-size benchmarking scripts") comparing the merge-base vs the PR head across -Os/-O2/-O0.

Design notes

- […] git merge-base <base> <head> so commits that land on the base branch after a PR branches aren't misattributed to the PR.
- […] GITHUB_TOKEN); same-repo branches work. A workflow_run split is the fork-safe upgrade if it's ever needed.
- […] profile_tool.py + embedded_bench.sh.

Status

Draft: stacks on #265 (me/embedded-size-bench); rebase together with #252 → #265 before marking ready.

Validated locally end-to-end with all three toolchains (profile_tool.py --revisions <base> <head>, ARM / MicroBlaze / x86-64 × Os/O2/O0).
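The merge-base comparison in the design notes can be sketched as follows. This is an illustration under assumptions, not the code in profile_tool.py: `merge_base` and `format_delta` are hypothetical names, and the byte counts in the usage note are made up.

```python
import subprocess


def merge_base(base: str, head: str) -> str:
    """Return the commit that base and head both descend from.

    Measuring against this commit, rather than the tip of the base branch,
    keeps later base-branch commits out of the PR's reported delta.
    """
    result = subprocess.run(
        ["git", "merge-base", base, head],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def format_delta(base_bytes: int, head_bytes: int) -> str:
    """Render a size cell such as '5110 B (+110 B, +2.2%)'."""
    diff = head_bytes - base_bytes
    if diff == 0:
        return f"{head_bytes} B (no change)"
    # Skip the percentage when there is no baseline to divide by.
    pct = f", {diff / base_bytes:+.1%}" if base_bytes else ""
    return f"{head_bytes} B ({diff:+d} B{pct})"
```

For example, `format_delta(5000, 5110)` yields `5110 B (+110 B, +2.2%)`, and a size that did not move renders as `(no change)` so reviewers can skim past it.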