Use failure-specific issue titles in conclusion job#38615
Conversation
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
🧠 Matt Pocock Skills Reviewer was skipped during the skills-based review. |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. No ADR enforcement needed: PR #38615 does not have the implementation label and has 0 new lines of code in business logic directories (threshold 100). |
|
🧪 Test Quality Sentinel completed test quality analysis. |
|
✅ PR Code Quality Reviewer completed the code quality review. |
There was a problem hiding this comment.
Pull request overview
This PR updates the conclusion-job failure issue creation flow to emit failure-specific GitHub issue titles (derived from detected failure signals) instead of always using a generic "[aw] <workflow> failed" title, improving triage and scanability.
Changes:
- Added
buildFailureIssueTitle(...)to derive a more specific issue title based on prioritized failure signals (timeout, missing safe outputs, report incomplete, budget/rate-limit, etc.). - Switched
main()to usebuildFailureIssueTitle(...)when creating/updating failure issues. - Expanded
handle_agent_failure.test.cjswith assertions for the new timeout / missing-safe-outputs / report-incomplete title variants, plus tighter env cleanup.
Show a summary per file
| File | Description |
|---|---|
| actions/setup/js/handle_agent_failure.cjs | Adds failure-aware issue title builder and wires it into the failure-issue creation path. |
| actions/setup/js/handle_agent_failure.test.cjs | Adds focused tests validating the new title selection behavior and improves env cleanup. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 2/2 changed files
- Comments generated: 2
| if (options.isTimedOut) return `[aw] ${workflowName} timed out`; | ||
| if (options.hasToolDenialsExceeded) return `[aw] ${workflowName} exceeded tool denial limit`; | ||
| if (options.hasCacheMissMisconfiguration) return `[aw] ${workflowName} has cache-memory path mismatch`; | ||
| if (options.hasReportIncomplete) return `[aw] ${workflowName} reported incomplete result`; | ||
| if (options.hasMissingSafeOutputs) return `[aw] ${workflowName} produced no safe outputs`; |
| if (options.hasStaleLockFileFailed) return `[aw] ${workflowName} has stale lock file`; | ||
| if (options.isTimedOut) return `[aw] ${workflowName} timed out`; | ||
| if (options.hasToolDenialsExceeded) return `[aw] ${workflowName} exceeded tool denial limit`; | ||
| if (options.hasCacheMissMisconfiguration) return `[aw] ${workflowName} has cache-memory path mismatch`; |
🧪 Test Quality Sentinel Report✅ Test Quality Score: 100/100 — Excellent
📊 Metrics & Test Classification (3 tests analyzed)
Test Classification Details
Language SupportTests analyzed:
No Go test files were modified in this PR. Verdict
📖 Understanding Test ClassificationsDesign Tests (High Value) verify what the system does:
Implementation Tests (Low Value) verify how the system does it:
Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators. References: §27350333029
|
There was a problem hiding this comment.
REQUEST_CHANGES — one blocking correctness bug, one maintainability gap
🔴 Blocking: cascade detection broken for all new precise titles
FAILURE_TITLE_PATTERN (/^\[aw\] .+ failed$/) and the findRecentFailureIssues search query ("failed" in:title) were not updated to match the 13 titles now produced by buildFailureIssueTitle. 11 of those 13 titles do not contain "failed", and the two that do (failed to mint GitHub App token, failed lockdown check) end mid-sentence so they also fail the $ anchor.
Result: a cascade of timed out, produced no safe outputs, exceeded budget, etc. issues will never be counted by cascade detection, silently suppressing the rollup that exists specifically to catch systemic failures.
Fix: relax FAILURE_TITLE_PATTERN to /^\[aw\] .+/ and drop the "failed" keyword from the cascade search query. See inline comment.
🟡 Maintainability: `buildFailureIssueTitle` not exported; 10 of 13 branches uncovered
buildFailureMatchCategories (the direct sibling) is exported and unit-tested. buildFailureIssueTitle is not exported, so its branches can only be exercised through the heavyweight main() integration path. The three new tests cover only isTimedOut, hasMissingSafeOutputs, and hasReportIncomplete; the highest-priority branches (hasDailyAICExceeded, maxAICreditsExceeded, ...) have zero coverage for the actual title strings they emit. A single typo in those strings goes undetected. See inline comment.
🔎 Code quality review by PR Code Quality Reviewer · ⌖ 13.4 AIC
| if (options.hasMissingSafeOutputs) return `[aw] ${workflowName} produced no safe outputs`; | ||
| if (options.hasMissingTool) return `[aw] ${workflowName} is missing required tool`; | ||
| if (options.hasMissingData) return `[aw] ${workflowName} is missing required data`; | ||
| return `[aw] ${workflowName} failed`; |
There was a problem hiding this comment.
Cascade detection is now blind to 11 of 13 new titles: FAILURE_TITLE_PATTERN at line 2072 (/^\[aw\] .+ failed$/) matches only the generic fallback, and the cascade search query at line 2124 hardcodes "failed" in:title. Every new precise title except the two failed * ones will be silently skipped by findRecentFailureIssues, so a flood of timed out, produced no safe outputs, exceeded budget, etc. issues will never trigger cascade rollup.
💡 Suggested fix
Update the constant and its comment (line 2072):
/** Matches all individual failure issue titles produced by handle_agent_failure */
const FAILURE_TITLE_PATTERN = /^\[aw\] .+/;And the search query at line 2124 — drop the "failed" keyword so new-style titles are returned by the GitHub search API:
const searchQuery = `repo:${owner}/${repo} is:issue is:open label:agentic-workflows "[aw]" in:title created:>=${since}`;Of the 13 possible titles this function returns, 11 do not contain the word "failed" at all, and the two that do (failed to mint ... / failed lockdown check) end mid-title so they also fail the $ anchor. Both the search filter and the regex need to widen.
| * @param {boolean} options.maxAICreditsExceeded | ||
| * @returns {string} | ||
| */ | ||
| function buildFailureIssueTitle(options) { |
There was a problem hiding this comment.
buildFailureIssueTitle is not exported, leaving 10 of 13 title branches untested at the unit level: its parallel function buildFailureMatchCategories is exported and can be exercised directly; buildFailureIssueTitle cannot. The three added integration tests cover only isTimedOut, hasMissingSafeOutputs, and hasReportIncomplete. The highest-priority branches — hasDailyAICExceeded, maxAICreditsExceeded, aiCreditsRateLimitError, hasAppTokenMintingFailed, hasLockdownCheckFailed, hasStaleLockFileFailed, hasToolDenialsExceeded, hasCacheMissMisconfiguration, hasMissingTool, hasMissingData — have no coverage for their title strings.
💡 Suggested fix
Add buildFailureIssueTitle to module.exports alongside buildFailureMatchCategories:
module.exports = {
// ...
buildFailureMatchCategories,
buildFailureIssueTitle, // add this
FAILURE_CATEGORIES_PATH,
};Then add lightweight unit tests that call buildFailureIssueTitle directly with each flag set in isolation, similar to how buildFailureMatchCategories is tested, so a typo in any branch title is caught immediately instead of at runtime.
|
@copilot run pr-finisher skill |
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
This PR makes failure issues from the conclusion job more specific by deriving titles from the detected failure mode instead of always emitting
[aw] <workflow> failed. It improves triage signal and issue scanability without changing existing fallback behavior.What changed
buildFailureIssueTitle(...)inactions/setup/js/handle_agent_failure.cjs.Title mapping added
[aw] <workflow> timed out[aw] <workflow> produced no safe outputs[aw] <workflow> reported incomplete resultTest updates
actions/setup/js/handle_agent_failure.test.cjswith focused assertions for: