fix: prune orphaned entries from actions-lock.json on full compile#39905
Conversation
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
@copilot only activate this mode when compiling all workflows, disable when targetting a single workflow. |
This is already the case. |
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
❌ Design Decision Gate 🏗️ failed during design decision gate check. |
|
✅ PR Code Quality Reviewer completed the code quality review. |
|
✅ Test Quality Sentinel completed test quality analysis. |
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
There was a problem hiding this comment.
Pull request overview
This PR addresses long-lived stale entries in .github/aw/actions-lock.json by tracking which repo@version cache keys are referenced during a full-directory compile and pruning cache entries that are no longer referenced before saving.
Changes:
- Added “used cache key” tracking to
ActionResolverand exposed it for post-processing. - Added
ActionCache.PruneOrphanedEntriesand invoked it only for full-directory compilation post-processing. - Updated tests and regenerated lock artifacts (
actions-lock.json, compiled.lock.yml).
Show a summary per file
| File | Description |
|---|---|
| pkg/workflow/compiler_types.go | Exposes the compiler’s shared ActionResolver for post-processing. |
| pkg/workflow/action_resolver.go | Tracks referenced cache keys during resolution for later pruning. |
| pkg/workflow/action_resolver_test.go | Adds coverage for used-cache-key tracking (cache-hit case). |
| pkg/workflow/action_cache.go | Adds PruneOrphanedEntries to delete unreferenced action entries. |
| pkg/workflow/action_cache_test.go | Adds pruning behavior tests (basic, empty set no-op, none/all orphaned). |
| pkg/cli/compile_post_processing.go | Wires pruning logic into post-processing for full-directory compiles. |
| pkg/cli/compile_pipeline.go | Calls orphan pruning only in runPostProcessingForDirectory. |
| actions/setup-cli/install.sh | Updates (but currently misstates) the sync/source-of-truth comment. |
| .github/workflows/daily-formal-spec-verifier.lock.yml | Regenerated compiled workflow lock output. |
| .github/aw/actions-lock.json | Regenerated action cache after pruning orphaned entries. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 10/10 changed files
- Comments generated: 3
| // Check cache first using the pre-computed key to avoid a second key allocation. | ||
| if sha, found := r.cache.GetByCacheKey(cacheKey); found { | ||
| resolverLog.Printf("Cache hit for %s@%s: %s", repo, version, sha) | ||
| r.usedCacheKeys[cacheKey] = true | ||
| return sha, nil |
| // GetUsedCacheKeys returns the set of cache keys (in "repo@version" format) that | ||
| // were successfully resolved from the cache or written to the cache during this run. | ||
| // These represent the action pins actually referenced by the compiled workflows. | ||
| func (r *ActionResolver) GetUsedCacheKeys() map[string]bool { | ||
| return r.usedCacheKeys | ||
| } |
| set +o histexpand | ||
|
|
||
| # Kept in sync with ../../install-gh-aw.sh — edit that file, then copy here. | ||
| # Kept in sync with actions/setup-cli/install.sh — edit this file, then copy to that path. |
🧪 Test Quality Sentinel Report✅ Test Quality Score: 90/100 — Excellent
📊 Metrics & Test Classification (5 tests analyzed)
Go: 5 ( Score breakdown:
Verdict
|
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /diagnose and /tdd — 4 observations, all non-blocking. This is a well-constructed fix.
📋 Key Themes & Highlights
Key Themes
- Encapsulation leak (
GetUsedCacheKeysreturns the live internal map) - Undocumented invariant (embedded-pin path safely omits tracking, but for a non-obvious reason that should be documented)
- Incomplete test coverage (the
dirtyflag — the save-to-disk gate — is not asserted after pruning) - Silent guard (nil-resolver early return emits no diagnostic log)
Positive Highlights
- ✅ The empty-set no-op guard in
PruneOrphanedEntriesis a solid safety net — avoids accidentally wiping the cache if the resolver never ran - ✅ Pruning is correctly restricted to
runPostProcessingForDirectory; the call-site is clearly explained in both the PR description and code comments - ✅ Four focused test cases in
action_cache_test.gocover the canonical bump scenario, the empty-set guard, the all-referenced no-op, and the all-orphaned case - ✅ The runtime evidence (90 orphaned entries removed from
.github/aw/actions-lock.json) makes the before/after impact immediately legible - ✅
GetSharedActionResolverreuses the existing lazy-init pattern fromGetSharedActionCache— consistent API
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer
Comments that could not be inline-anchored
pkg/workflow/action_resolver.go:37
[/diagnose] GetUsedCacheKeys() returns the live internal map, not a copy — callers can accidentally mutate the resolver's state.
In the current code path (pruneOrphanedActionCacheEntries) the map is only read, so no bug today. But returning internal state directly is a well-known source of subtle mutations.
<details>
<summary>💡 Suggested fix</summary>
func (r *ActionResolver) GetUsedCacheKeys() map[string]bool {
result := make(map[string]bool, len(r.usedCacheKeys))
for k, v …
</details>
<details><summary>pkg/workflow/action_resolver.go:87</summary>
**[/diagnose]** The embedded-pin path returns without recording the key in `usedCacheKeys` — only the comment for `cache.Set` explains the intentional skip, leaving the tracking omission undocumented.
This is safe **today** because the disk-cache check (line 56) always fires first: if the key is in the cache it is tracked there; if it is not in the cache there is nothing to prune. But the invariant is non-obvious and could silently break if the cache-check and embedded-pin code blocks are ever…
</details>
<details><summary>pkg/workflow/action_cache_test.go:921</summary>
**[/tdd]** The test verifies entry count and key presence, but does not assert that `cache.dirty` is set after pruning — so a regression that breaks the save-to-disk path would go undetected here.
`PruneOrphanedEntries` sets `c.dirty = true` on every deletion; the test should confirm that.
<details>
<summary>💡 Suggested assertion</summary>
```go
// After the existing assertions:
if !cache.dirty {
t.Error("Expected cache.dirty to be true after pruning an orphaned entry")
}Similarly…
pkg/cli/compile_post_processing.go:316
[/diagnose] When resolver == nil the function returns silently — no log line, no signal that pruning was skipped. This makes it hard to diagnose why orphaned entries were not removed during a debugging session.
<details>
<summary>💡 Suggested addition</summary>
resolver := compiler.GetSharedActionResolver()
if resolver == nil {
compilePostProcessingLog.Printf("Skipping orphaned entry pruning: no shared action resolver available")
return
}This is consistent with how o…
There was a problem hiding this comment.
REQUEST_CHANGES — two issues must be fixed before merging
The approach is sound, but there are two defects in the current implementation.
### Blocking issues
1. Prune runs even on partial compile failure (compile_pipeline.go:611)
runPostProcessingForDirectory does not receive errorCount, so pruneOrphanedActionCacheEntries fires unconditionally even when one or more workflows failed to compile. A workflow that fails never contributes to usedCacheKeys, so its legitimate action pins get deleted from actions-lock.json. The stated invariant — "only safe after a full-directory compilation (all workflows compiled)" — is not enforced. Fix: pass errorCount and skip the prune when it is non-zero.
2. GetUsedCacheKeys() exposes the internal map (action_resolver.go:37)
The method returns the live usedCacheKeys reference. Callers can silently mutate the resolver's tracking state, causing incorrect prune decisions. Return a copy instead.
### Non-blocking note
The PR description's "After" example shows microsoft/apm-action@v1.9.1 surviving the prune, but the actual actions-lock.json diff removes that entry too. If that action is genuinely unused by any workflow in the directory the pruning is correct — the description just uses a misleading hypothetical. Worth updating the example or adding a note so the behaviour is clearly documented.
🔎 Code quality review by PR Code Quality Reviewer
| // Prune orphaned entries — entries for action versions no longer referenced | ||
| // by any workflow in the directory (e.g. old pins left after a version bump). | ||
| // Safe to call only after a full-directory compilation (all workflows compiled). | ||
| pruneOrphanedActionCacheEntries(compiler, actionCache) |
There was a problem hiding this comment.
Orphaned-entry pruning runs even when workflows fail to compile, silently deleting their legitimate action pins.
The errorCount variable (declared at line 301, incremented at line 338) is never passed to runPostProcessingForDirectory, so pruneOrphanedActionCacheEntries fires unconditionally — including when some workflows failed to compile. When a workflow fails (e.g. a template syntax error), ResolveSHA is never called for its uses: entries, so those keys are absent from usedCacheKeys. PruneOrphanedEntries then deletes them from actions-lock.json.
On the next successful compile those SHAs must be re-fetched from GitHub. More importantly, this violates the stated invariant: "Safe to call only after a full-directory compilation (all workflows compiled)" — currently that means all workflows attempted, not all workflows succeeded.
💡 Suggested fix
Thread errorCount into runPostProcessingForDirectory and gate the orphan prune:
// compile_pipeline.go call site
if err := runPostProcessingForDirectory(ctx, compiler, workflowDataList, config,
workflowsDir, gitRoot, successCount, errorCount); err != nil {// runPostProcessingForDirectory signature
func runPostProcessingForDirectory(
ctx context.Context,
compiler *workflow.Compiler,
workflowDataList []*workflow.WorkflowData,
config CompileConfig,
workflowsDir string,
gitRoot string,
successCount int,
errorCount int, // <-- add
) error {// pruneOrphanedActionCacheEntries in compile_post_processing.go
func pruneOrphanedActionCacheEntries(
compiler *workflow.Compiler,
actionCache *workflow.ActionCache,
errorCount int,
) {
if errorCount > 0 {
// usedCacheKeys is incomplete: at least one workflow failed to compile
// so its pins were never recorded. Pruning now would incorrectly
// evict still-valid cache entries for the broken workflows.
return
}
...
}| // were successfully resolved from the cache or written to the cache during this run. | ||
| // These represent the action pins actually referenced by the compiled workflows. | ||
| func (r *ActionResolver) GetUsedCacheKeys() map[string]bool { | ||
| return r.usedCacheKeys |
There was a problem hiding this comment.
GetUsedCacheKeys() returns the live internal map — callers can mutate the resolver's tracking state.
Returning r.usedCacheKeys directly hands the caller a reference to the map that ResolveSHA writes to. Any caller that adds or removes keys from the returned map silently corrupts which entries survive PruneOrphanedEntries. If compilation ever becomes concurrent, iterating the returned map while another goroutine calls ResolveSHA is also an unsynchronized access.
💡 Suggested fix
Return a shallow copy:
func (r *ActionResolver) GetUsedCacheKeys() map[string]bool {
out := make(map[string]bool, len(r.usedCacheKeys))
for k, v := range r.usedCacheKeys {
out[k] = v
}
return out
}This isolates the resolver's internal state from downstream callers and makes the API safe by default.
🏗️ Design Decision Gate — ADR RequiredThis PR makes significant changes to core business logic (>100 new lines across 📄 Draft ADR committed:
📋 What to do next
Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision. ❓ Why ADRs Matter
ADRs create a searchable, permanent record of why the codebase looks the way it does. Future contributors (and your future self) will thank you. 📋 Michael Nygard ADR Format ReferenceAn ADR must contain these four sections to be considered complete:
All ADRs are stored in
|
|
``
|
|
@copilot run pr-finisher skill |
|
@copilot please summarize the remaining blockers and the next fix-test step.
|
|
@copilot review all comments and address unresolved review feedback.
|
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Ran the pr-finisher flow. I addressed the actionable logic issues in this PR with commit |
| - name: Save activity report logs cache | ||
| if: ${{ always() }} | ||
| uses: actions/cache/save@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5 | ||
| uses: |
There was a problem hiding this comment.
Fixed in 759b479 — restored actions/cache/save@v5.0.5 entry to actions-lock.json and regenerated agentics-maintenance.yml with the correct uses: field.
| - name: Save forecast report logs cache | ||
| if: ${{ always() }} | ||
| uses: actions/cache/save@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5 | ||
| uses: |
There was a problem hiding this comment.
Fixed in 759b479 — restored actions/cache/save@v5.0.5 entry to actions-lock.json and regenerated agentics-maintenance.yml with the correct uses: field.
Compiler-generated actions (actions/cache/*, actions/checkout, actions/github-script, github/codeql-action/upload-sarif) are embedded in Go code rather than markdown workflows, so they won't be tracked as "used" during normal compilation. These actions must never be pruned because they're always potentially needed by the compiler. Updated PruneOrphanedEntries to skip compiler-generated action repos regardless of referenced set. Added test coverage and updated existing test that assumed actions/checkout could be pruned. Restored actions-lock.json from main to ensure embedded pins contain compiler-generated actions (they were previously pruned, breaking tests). Fixes the CI failure at: https://github.com/github/gh-aw/actions/runs/27724121408/job/82016442512?pr=39905 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Fixed in 7b0f6aa. The pruning logic now preserves compiler-generated actions (actions/cache/*, actions/checkout, actions/github-script, github/codeql-action/upload-sarif) even when they're not tracked as "used" by markdown workflows. These actions are embedded in Go code and must always be available. Restored actions-lock.json from main to repopulate embedded pins, added test coverage, and verified |
|
@copilot review the entries remove and make sure they are not used by runtime as well |
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Fixed in commit above. The pruning logic now preserves all runtime-managed actions from |
|
@copilot ensure pins are sorted to ensure stability |
…tion Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
The sorting is already implemented and working correctly. Both action entries and container pins are sorted alphabetically via the Enhanced test coverage in commit above to explicitly verify container pin sorting order matches the pattern used in |
|
Please tighten the PR description to explicitly note the full-compile-only pruning guard.
|
After bumping an action version, the old
repo@versionentry lingered inactions-lock.jsonindefinitely —compileadded the new pin but never removed the old one, causing stale, unreferenced SHA entries to accumulate.Approach
Track which cache keys are actually used during a compilation run, then remove any entry not in that set when saving.
ActionResolver— track used keysAdded
usedCacheKeys map[string]boolpopulated on every cache hit and every newcache.Setcall. Exposed viaGetUsedCacheKeys().ActionCache.PruneOrphanedEntriesNew method: removes entries whose keys are absent from the provided referenced-key set. Empty set is a no-op — guards against accidentally wiping the cache if the resolver was never used.
Preserves compiler-managed actions: Actions embedded in Go code are never pruned because they're always potentially needed by the compiler, even if no markdown workflow currently uses them. This includes:
actions/cache/*,actions/checkout,actions/github-script,github/codeql-action/upload-sarifruntime_definitions.go(setup-node, setup-python, setup-java, setup-dotnet, setup-go, setup-uv, setup-ruby, setup-bun, setup-deno, erlef/setup-beam, haskell-actions/setup, gh-aw/actions/setup-cli) used by the runtime detection and auto-setup systemCompiler.GetSharedActionResolverNew public accessor so post-processing can retrieve the resolver's tracked key set.
pruneOrphanedActionCacheEntries(post-processing)Wires the above together. Called only in
runPostProcessingForDirectory(full directory compile) — deliberately excluded fromrunPostProcessing(specific-files compile), where other uncompiled workflows might still reference the entries.Effect
pr-sous-chef: please refresh the branch and keep working through the blocked merge state. Run: https://github.com/github/gh-aw/actions/runs/27726614160
> Generated by 👨🍳 PR Sous Chef · ◷