Add PR code-size benchmark comment workflow by AaronWebster · Pull Request #267 · google/emboss

AaronWebster · 2026-06-05T02:47:35Z

Third PR in the embedded code-size chain (stacked after #252 and #265). Adds a GitHub Actions workflow that comments on each PR with the size of the generated Ok() code, so generated-code-size regressions are visible during review.

What it does

On every pull request that can affect generated code, .github/workflows/code-size.yml:

Provisions ARM Cortex-M4 (STM32) via gcc-arm-none-eabi, MicroBlaze via the matching Bootlin toolchain, and uses the host x86-64 g++.
Runs scripts/profile_tool.py (from #265, "Add embedded code-size benchmarking scripts") to compare the merge-base against the PR head at -Os / -O2 / -O0.
Posts/updates a single sticky comment with the size table and per-symbol deltas.
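The "single sticky comment" behavior can be sketched as a find-or-create decision over the PR's existing comments. This is a minimal illustration, not the workflow's actual code: the hidden HTML marker and the function name `plan_sticky_comment` are assumptions, and the input is assumed to be a list of dicts with `id` and `body` keys, as returned by GitHub's list-issue-comments REST endpoint.

```python
# Hypothetical hidden marker; the PR does not say how the comment is identified.
MARKER = "<!-- code-size-benchmark -->"


def plan_sticky_comment(existing_comments, body, marker=MARKER):
    """Decide whether to update an existing bot comment or create a new one.

    existing_comments: iterable of dicts with "id" and "body" keys.
    Returns (action, comment_id, tagged_body), where action is
    "update" or "create" and comment_id is None for "create".
    """
    tagged_body = f"{marker}\n{body}"
    for comment in existing_comments:
        # The first comment carrying the marker is treated as the sticky one.
        if marker in (comment.get("body") or ""):
            return ("update", comment["id"], tagged_body)
    return ("create", None, tagged_body)
```

Keying on a marker embedded in the body, rather than on the comment author, keeps the lookup stable even if other bots comment on the same PR.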

Design notes

Merge-base, not base tip: baselines against git merge-base <base> <head> so commits that land on the base branch after a PR branches aren't misattributed to the PR.
Comment-only: nothing is committed and no branches are written — purely a review-time signal (no persistent file, by design).
Forks: the comment step is skipped for fork PRs (read-only GITHUB_TOKEN); same-repo branches work. A workflow_run split is the fork-safe upgrade if it's ever needed.
Reuse: no new scripts — the workflow drives the existing profile_tool.py + embedded_bench.sh.
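The merge-base choice described in the design notes can be sketched as a thin wrapper around `git merge-base`. The function name and the injectable `run` parameter are illustrative assumptions; only the `git merge-base <base> <head>` command itself comes from the PR description.

```python
import subprocess


def merge_base(base_ref, head_ref, run=subprocess.run):
    """Return the commit where head_ref branched off base_ref.

    Using this commit as the baseline, instead of the current tip of the
    base branch, keeps later base-branch commits out of the PR's delta.
    `run` is injectable so the function can be exercised without a repo.
    """
    result = run(
        ["git", "merge-base", base_ref, head_ref],
        check=True,           # raise if git fails (e.g. unknown ref)
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

In a workflow, `base_ref` and `head_ref` would typically be the PR's base and head SHAs, and the checkout needs enough history for git to find the common ancestor.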

Status

Draft — stacks on #265 (me/embedded-size-bench); rebase together with #252 → #265 before marking ready.

Validated locally end-to-end with all three toolchains (profile_tool.py --revisions <base> <head>, ARM / MicroBlaze / x86-64 × Os/O2/O0).

github-actions · 2026-06-05T02:48:51Z

📐 Generated code size & instructions

0479f443e → 4b73fbc08 · smaller is better · ✅ no change

Code size (compiled .text bytes) and Instructions (objdump count) are totals for the generated code of the all-features benchmark.emb fixture. many_conditionals Ok() is the size (bytes) of the optimized conditional-validation method, a highlight. Δ vs the merge-base · 🟢 smaller / 🔴 larger.

`-Os` (embedded)

| Target · Compiler | Code size | Instructions | `many_conditionals Ok()` |
|---|---|---|---|
| x86-64 · gcc | 12762 B | 2373 | 2411 B |
| x86-64 · clang | 6259 B | 918 | 1460 B |
| ARM Cortex-M4 · gcc | 7112 B | 1817 | 1964 B |
| MicroBlaze · gcc | 11636 B | 2640 | 14104 B |

-O2 / -O0

-O2

| Target · Compiler | Code size | Instructions | `many_conditionals Ok()` |
|---|---|---|---|
| x86-64 · gcc | 5110 B | 881 | 6861 B |
| x86-64 · clang | 3143 B | 499 | 4608 B |
| ARM Cortex-M4 · gcc | 3329 B | 871 | 3524 B |
| MicroBlaze · gcc | 6492 B | 1354 | 10280 B |

-O0

| Target · Compiler | Code size | Instructions | `many_conditionals Ok()` |
|---|---|---|---|
| x86-64 · gcc | 99755 B | 19650 | 6471 B |
| x86-64 · clang | 76151 B | 13924 | 7304 B |
| ARM Cortex-M4 · gcc | 59170 B | 19317 | 6816 B |
| MicroBlaze · gcc | 107134 B | 23920 | 8316 B |

Compilers: x86-64 gcc 13.3.0 · x86-64 clang 18.1.3 · ARM Cortex-M4 gcc 13.2.1 · MicroBlaze gcc 14.3.0. benchmark.emb is a fixed fixture (Ok()+CopyFrom over every top-level view); it is pulled forward from head, so only the code generator under test varies between base and head.
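The comment's legend (✅ no change, 🟢 smaller, 🔴 larger) suggests a simple sign-based rendering of each delta. The sketch below is one way that could look; the function name and the exact output strings are assumptions, since the rendering script's real format is only visible here for the no-change case.

```python
def format_delta(base_bytes, head_bytes):
    """Render a size change relative to the merge-base (smaller is better)."""
    delta = head_bytes - base_bytes
    if delta == 0:
        return "✅ no change"
    # Negative delta means the PR head is smaller than the baseline.
    marker = "🟢" if delta < 0 else "🔴"
    return f"{marker} {delta:+d} B"
```

For example, `format_delta(7112, 7000)` yields `🟢 -112 B`, while equal sizes yield `✅ no change`.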

On each pull request that can affect generated code, compile a fixed all-features benchmark schema and the many_conditionals Ok() highlight across x86-64 (gcc + clang), ARM Cortex-M4 (gcc) and MicroBlaze (gcc) at -Os/-O2/-O0, and post a sticky comment with the .text size and objdump instruction count, compared against the merge-base. The benchmark schema is held fixed and pulled forward from the PR head, so only the code generator under test varies between base and head; adding or editing other test .emb files cannot move the numbers. clang has no MicroBlaze back end, so that target is gcc-only. Adds testdata/benchmark.emb (fixed fixture), scripts/size_bench.py (matrix compile + size/objdump measurement to JSON), scripts/size_comment.py (renders the sticky comment), and .github/workflows/code-size.yml (the pull_request workflow).
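As a rough illustration of the "objdump instruction count" mentioned above, instructions can be counted by matching disassembly lines in `objdump -d` output. This is a sketch under assumptions: the PR does not show how scripts/size_bench.py does the counting, and `count_instructions` and its regular expression are hypothetical.

```python
import re

# An instruction line in `objdump -d` output looks like
#   "   1:\t48 89 e5             \tmov    %rsp,%rbp"
# i.e. address, colon, tab, hex bytes, tab, mnemonic. Continuation lines
# of long instructions carry only bytes and have no mnemonic field, and
# symbol headers such as "0000000000000000 <Ok>:" have no tab after the colon.
_INSN = re.compile(r"^\s*[0-9a-f]+:\t[0-9a-f ]+\t\S")


def count_instructions(disassembly):
    """Count instruction lines in the text produced by `objdump -d`."""
    return sum(1 for line in disassembly.splitlines() if _INSN.match(line))
```

One caveat of this approach: data words that objdump prints inside code sections (for example ARM literal pools shown as `.word`) match the same pattern and would be counted as instructions unless filtered separately.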

AaronWebster · 2026-06-05T17:20:02Z

This PR was based on ideas from @robrussell.

I thought the cleanest way to implement this kind of benchmark was to have an Actions runner post a comment on each PR, both as an FYI and to tip off reviewers to any major regressions. Performance is reported as an instruction count, which is probably meaningless for x86-64 but gives a fuzzy idea of what's going on (a real timing benchmark isn't practical because GitHub's Actions runners are pretty noisy).

Open to improvements to the comment format if any are obvious.

AaronWebster force-pushed the emboss/code-size-ci branch 2 times, most recently from 13127bc to bcfb21f · June 5, 2026 09:18

AaronWebster force-pushed the emboss/code-size-ci branch from bcfb21f to 4b73fbc · June 5, 2026 09:31

AaronWebster marked this pull request as ready for review June 5, 2026 17:15

AaronWebster requested a review from robrussell June 5, 2026 17:15

AaronWebster wants to merge 1 commit into me/embedded-size-bench from emboss/code-size-ci
