feat(cache): closes #686 — HIR-addressable per-module object cache (Plan C)#688
TheHypnoo wants to merge 9 commits
Rebased onto upstream main (v0.5.811 → v0.5.818). Upstream added
`Expr::AsyncStepDone { value, step_closure }` (PerryTS#688 P2 work, commit
b216cef) which the exhaustive match in `stable_hash.rs` did not
cover, breaking CI with E0004. Added the variant with tag 442 and
walked both children. Other small rustfmt-driven whitespace fixes
on touched files.
CI status update after rebasing on `main` (now at v0.5.818) and adding the missing `Expr::AsyncStepDone` arm to the exhaustive match in `stable_hash.rs` (the variant landed in upstream commit `b216cef1` while this PR was open). The remaining red jobs are all pre-existing on `main`, not introduced by this PR:
Confirmed by inspecting the most recent `Tests` workflow run on `main` itself (sha `6640820a`), which shows the same failures: `lint`, `api-docs-drift`, and all `doc-tests` variants. The `cargo-test` failure is newer (post-`6640820a`, from the v0.5.810/811 manifest landings). `harmonyos-smoke` is the one job in this matrix that exercises this PR's code path, and it is green.

Files changed by this PR (9 total, none of them in the above failure list):

Local validation on this branch:
Happy to fold in maintainer fixes for the upstream-broken jobs if that's preferred over waiting for a separate cleanup PR. Just let me know.
@TheHypnoo Sorry, main is "corrupted" right now by all the updates flying in. Will clean it by EOD and merge this :)
…ache (Plan C)

Replace the V2.2 source-bytes hash in the per-module object cache key with a deterministic fingerprint of the post-transform HIR that `perry_codegen::compile_module` actually consumes. Formatter-only edits, comment changes, quote-style rewrites, and any other source change that lowers to identical final HIR now hit the cached `.o`.

New `perry_hir::stable_hash` module walks the HIR with an exhaustive match (no wildcards) over every variant, so adding a new HIR variant breaks the build and forces the author to assign a stable tag and walk children. The only HashMap reachable from `Module` (`ObjectType.properties`) is sorted by key before emit.

Hash is computed inside the rayon per-module job, after every HIR-mutating pass has run. `compile_module(&Module, ...)` is documented as load-bearing: switching to `&mut` would also require moving the hash to AFTER codegen.

Diagnostics: `PERRY_DEV_VERBOSE=1` now prints per-miss `hir=<...> key=<...>` lines; `PERRY_CACHE_DEBUG_HIR=1` dumps the post-transform HIR of every miss to `.perry-cache/debug/<key>.txt` so two miss-dumps can be diffed.

Tests: new `stable_hash` unit tests (5) cover insertion-order independence, behavior-vs-cosmetic differentiation, and per-field metadata sensitivity. New `cross_process.rs` integration test spawns an example binary twice via `cargo run --example` and asserts byte-identical stdout. This is the only way to catch a forgotten HashMap sort, since Rust's RandomState only randomizes between processes. `scripts/run_cache_tests.sh` gains cosmetic-edit and behavior-edit steps end-to-end.
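To illustrate the sort-before-emit rule described above, here is a minimal sketch. It is not the code in `perry_hir::stable_hash`: the `Fnv1a` hasher, the simplified `ObjectType` (string-to-string properties), and the `fingerprint` helper are all assumptions made for the example. The point is only that collecting the `HashMap` entries and sorting them by key before feeding the hasher makes the result independent of insertion and iteration order.

```rust
use std::collections::HashMap;

/// 64-bit FNV-1a, used here only as a stand-in for a hasher whose output
/// does not depend on per-process random state. (Assumption: the real
/// `StableHasher` may use a different algorithm.)
struct Fnv1a(u64);

impl Fnv1a {
    fn new() -> Self {
        Fnv1a(0xcbf2_9ce4_8422_2325) // FNV-1a 64-bit offset basis
    }
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3); // FNV 64-bit prime
        }
    }
    fn write_str(&mut self, s: &str) {
        // Length prefix keeps ("ab", "c") distinct from ("a", "bc").
        self.write(&(s.len() as u64).to_le_bytes());
        self.write(s.as_bytes());
    }
    fn finish(&self) -> u64 {
        self.0
    }
}

/// Hypothetical, heavily simplified stand-in for the HIR `ObjectType`.
struct ObjectType {
    properties: HashMap<String, String>,
}

/// Hash the properties in sorted-key order so HashMap iteration order
/// never reaches the hasher.
fn hash_object_type(ty: &ObjectType, h: &mut Fnv1a) {
    let mut entries: Vec<(&String, &String)> = ty.properties.iter().collect();
    entries.sort_by(|a, b| a.0.cmp(b.0));
    h.write(&(entries.len() as u64).to_le_bytes());
    for (name, type_name) in entries {
        h.write_str(name);
        h.write_str(type_name);
    }
}

/// Illustrative helper: build an `ObjectType` from pairs and fingerprint it.
fn fingerprint(pairs: &[(&str, &str)]) -> u64 {
    let ty = ObjectType {
        properties: pairs
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect(),
    };
    let mut h = Fnv1a::new();
    hash_object_type(&ty, &mut h);
    h.finish()
}

fn main() {
    let a = fingerprint(&[("x", "f64"), ("y", "f64")]);
    let b = fingerprint(&[("y", "f64"), ("x", "f64")]);
    let c = fingerprint(&[("x", "f64"), ("y", "i64")]);
    assert_eq!(a, b); // insertion order does not matter
    assert_ne!(a, c); // a changed property type does
    println!("{:016x}", a);
}
```

Without the `sort_by` call, the hash of the same map could differ between two runs of the same binary, which is the failure mode the cross-process test is meant to catch.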
Two `main` merges into this branch (v0.5.818 → v0.5.867) added four HIR variants the exhaustive walker did not yet cover, breaking the build with E0004:

- Expr::CurrentStepClosure (tag 443): PerryTS#691 Phase 2 async-step opt
- Expr::AsyncFirstCall { step_closure } (tag 444): same series
- Expr::TaggedTemplateStrings { cooked, raw } (tag 445): tagged template literals
- Expr::TemplateRaw(Box<Expr>) (tag 446): strings.raw accessor

This is exactly the regression gate the no-wildcard match in stable_hash.rs is designed to catch: any HIR addition forces an explicit tag assignment + child walk decision.
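The regression gate can be sketched as follows. This is an illustration only, not the contents of `stable_hash.rs`: the miniature `Expr` enum reuses four variant names and tag numbers from this PR, but the payload types, the `hash_expr` and `encode` helpers, and the choice to emit tags as little-endian `u16` bytes are assumptions. Because the `match` has no `_ =>` arm, adding a fifth variant to `Expr` fails to compile with E0004 until it gets its own arm.

```rust
/// Hypothetical miniature of the HIR expression type. Variant names and tags
/// come from this PR; payloads are simplified to `Box<Expr>`.
#[allow(dead_code)]
enum Expr {
    AsyncStepDone { value: Box<Expr>, step_closure: Box<Expr> },
    CurrentStepClosure,
    AsyncFirstCall { step_closure: Box<Expr> },
    TemplateRaw(Box<Expr>),
}

/// Append a stable numeric tag. The tag, not the enum discriminant, is what
/// identifies the variant, so reordering variants does not change the output.
fn tag(out: &mut Vec<u8>, t: u16) {
    out.extend_from_slice(&t.to_le_bytes());
}

/// Exhaustive walk with no wildcard arm: a new variant is a compile error
/// here until someone assigns it a tag and decides which children to walk.
fn hash_expr(e: &Expr, out: &mut Vec<u8>) {
    match e {
        Expr::AsyncStepDone { value, step_closure } => {
            tag(out, 442);
            hash_expr(value, out);
            hash_expr(step_closure, out);
        }
        Expr::CurrentStepClosure => tag(out, 443),
        Expr::AsyncFirstCall { step_closure } => {
            tag(out, 444);
            hash_expr(step_closure, out);
        }
        Expr::TemplateRaw(inner) => {
            tag(out, 446);
            hash_expr(inner, out);
        }
    }
}

/// Illustrative helper: the byte stream a real hasher would consume.
fn encode(e: &Expr) -> Vec<u8> {
    let mut out = Vec::new();
    hash_expr(e, &mut out);
    out
}

fn main() {
    let e = Expr::AsyncFirstCall {
        step_closure: Box::new(Expr::CurrentStepClosure),
    };
    // 444 = 0x01BC and 443 = 0x01BB, each written little-endian.
    assert_eq!(encode(&e), vec![0xBC, 0x01, 0xBB, 0x01]);
    println!("{:?}", encode(&e));
}
```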
Summary
Closes #686. Replaces the V2.2 source-bytes hash in the per-module object cache key with a deterministic fingerprint of the post-transform HIR that `perry_codegen::compile_module` actually consumes. Formatter-only edits, comment changes, quote-style rewrites, and any other source change that lowers to identical final HIR now hit the cached `.o`.

- New `perry_hir::stable_hash` module with `hash_module(&Module) -> u64` and a streaming `hash_module_with<H: StableHasher>` variant. Walks the HIR with an exhaustive match (no `_ =>` wildcards) over every variant, so adding a new HIR variant breaks the build and forces the author to assign a stable tag and walk children. The only `HashMap` reachable from `Module` (`ObjectType.properties`) is sorted by key before emit.
- `compute_object_cache_key`'s second parameter renamed `source_hash → hir_hash`; field tag `"src" → "hir"` so any pre-#686 cache entries cleanly miss.
- Hash is computed inside the rayon per-module job (`compile.rs:3905`), after every HIR-mutating pass has run. `compile_module(&Module, ...)` documented as load-bearing: switching to `&mut` would also require moving the hash to AFTER codegen.
- `CompilationContext.module_source_hashes` and the per-module `djb2_hash(source.as_bytes())` call in `collect_modules.rs` removed (no remaining consumers).

Diagnostics
- `PERRY_DEV_VERBOSE=1` now prints per-miss `cache miss: <name> hir=<...> key=<...>` lines on top of the existing aggregate `codegen cache: H/T hit (M miss)` summary.
- `PERRY_CACHE_DEBUG_HIR=1` dumps the post-transform HIR of every miss to `.perry-cache/debug/<key>.txt` so two miss-dumps can be `diff`-ed to find the divergent field.

Tests
- `perry_hir::stable_hash` unit tests covering insertion-order independence, behavior-vs-cosmetic differentiation, per-field metadata sensitivity, and a pinned in-process determinism check.
- `crates/perry-hir/tests/cross_process.rs` + `examples/stable_hash_cross_process.rs`: spawns the example binary twice via `cargo run --example` and asserts byte-identical stdout. This is the only way to catch a forgotten HashMap sort, since Rust's `RandomState` only randomizes iteration BETWEEN processes, not within one.
- `object_cache_tests::key_changes_with_source_hash` renamed to `key_changes_with_hir_hash` with a comment documenting the inverse case ("same source bytes, different HIR" is covered by `build_id`, not by this hash). All 18 cache tests stay green.
- `scripts/run_cache_tests.sh` gains two new steps after the existing cold/warm/partial/rewarm cycle: a cosmetic-edit step and a behavior-edit step.

Test plan
- `perry-codegen::manifest_consistency::every_dispatch_entry_has_manifest_counterpart` failure from v0.5.810 (unrelated to this PR; verified by reproducing on `main`)

Out of scope (per #686 non-goals)
No JIT, no interpreter, no separate `perry dev` backend, no Plan B (linked-binary cache), no effort to keep the HIR hash minimal across mono bookkeeping changes. Over-invalidation when `monomorphize_module` rewrites a node into something codegen would have produced identical bytes for is the accepted trade for keeping the hash a simple post-transform fingerprint.

Notes for the maintainer
Per CONTRIBUTING.md, this PR does not touch `Cargo.toml` version, the `Current Version:` line in `CLAUDE.md`, or `CHANGELOG.md` — left for the maintainer to fold in at merge.
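For reviewers, the `"src"` → `"hir"` field-tag change from the Summary can be illustrated with a small sketch. This is not `compute_object_cache_key`: the `sketch_cache_key` function, its parameter list, the `name=...;tag=...;build=...` layout, and the use of djb2 for the final key are all assumptions. It shows only why renaming the field tag makes every old entry miss, even if the numeric hash value happened to be the same.

```rust
/// Classic djb2 (h = h * 33 + byte), starting from 5381.
/// Assumption: used here only as a simple deterministic hash for the sketch.
fn djb2(bytes: &[u8]) -> u64 {
    let mut h: u64 = 5381;
    for &b in bytes {
        h = h.wrapping_mul(33).wrapping_add(u64::from(b));
    }
    h
}

/// Hypothetical key builder: each input is written under a named field tag,
/// and the whole tagged string is hashed into the cache key.
fn sketch_cache_key(module: &str, field_tag: &str, content_hash: u64, build_id: &str) -> String {
    let material = format!(
        "name={};{}={:016x};build={}",
        module, field_tag, content_hash, build_id
    );
    format!("{:016x}", djb2(material.as_bytes()))
}

fn main() {
    // Same module, same 64-bit value, same build: only the field tag differs.
    let old_key = sketch_cache_key("a.ts", "src", 42, "build-1");
    let new_key = sketch_cache_key("a.ts", "hir", 42, "build-1");
    assert_ne!(old_key, new_key); // old entries cannot be mistaken for new ones

    // The key is a pure function of its inputs.
    assert_eq!(new_key, sketch_cache_key("a.ts", "hir", 42, "build-1"));
    println!("{} -> {}", old_key, new_key);
}
```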