Use failure-specific issue titles in conclusion job by Copilot · Pull Request #38615 · github/gh-aw

Copilot · 2026-06-11T13:26:10Z

This PR makes failure issues from the conclusion job more specific by deriving titles from the detected failure mode instead of always emitting [aw] <workflow> failed. It improves triage signal and issue scannability without changing existing fallback behavior.

What changed
- Added buildFailureIssueTitle(...) in actions/setup/js/handle_agent_failure.cjs.
- Replaced static title generation with failure-aware title selection.
- Kept existing generic fallback when no specific condition matches.
Title mapping added
- Timeout → [aw] <workflow> timed out
- Missing safe outputs → [aw] <workflow> produced no safe outputs
- Report incomplete → [aw] <workflow> reported incomplete result
- Plus targeted variants for budget/rate-limit and selected infra/auth failure signals, with priority ordering.
Test updates
- Extended actions/setup/js/handle_agent_failure.test.cjs with focused assertions for:
  - timeout title
  - missing safe outputs title
  - report_incomplete title
- Tightened env var cleanup in new test paths to avoid cross-test leakage.
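The env cleanup mentioned in the last bullet can be pictured with a small helper. This is only a sketch of the general save-and-restore pattern, not the code in the PR: the helper name `withEnv` and the variable `EXAMPLE_AGENT_CONCLUSION` are made up for illustration, and the real tests may clean up differently (for example in `afterEach`).

```javascript
// Hypothetical helper (not from the PR): run `fn` with temporary env
// overrides and always restore the previous state, even if `fn` throws.
function withEnv(overrides, fn) {
  const saved = {};
  for (const key of Object.keys(overrides)) {
    saved[key] = process.env[key]; // undefined when the variable was unset
    process.env[key] = overrides[key];
  }
  try {
    return fn();
  } finally {
    for (const key of Object.keys(saved)) {
      if (saved[key] === undefined) {
        delete process.env[key]; // was unset before, so unset it again
      } else {
        process.env[key] = saved[key];
      }
    }
  }
}

// Usage: the variable name below is an assumption for the example only.
const seen = withEnv({ EXAMPLE_AGENT_CONCLUSION: "timed_out" }, () => {
  return process.env.EXAMPLE_AGENT_CONCLUSION;
});
console.log(seen); // value observed inside the callback
console.log(process.env.EXAMPLE_AGENT_CONCLUSION); // back to its prior state
```

Restoring in `finally` is what prevents leakage: a failing assertion inside the callback still leaves `process.env` as the next test expects it.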

const issueTitle = buildFailureIssueTitle({
  workflowName: sanitizedWorkflowName,
  isTimedOut,
  hasMissingSafeOutputs,
  hasReportIncomplete,
  hasMissingTool,
  hasMissingData,
  hasCacheMissMisconfiguration,
  hasToolDenialsExceeded,
  hasAppTokenMintingFailed,
  hasLockdownCheckFailed,
  hasStaleLockFileFailed,
  hasDailyAICExceeded,
  aiCreditsRateLimitError,
  maxAICreditsExceeded,
});
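To make the priority ordering concrete, here is a minimal sketch of a first-match-wins title selector. It uses only the title strings and the relative order visible in the diff excerpts quoted in this thread; the budget, rate-limit, token-minting, and lockdown branches are left out because their exact strings are not shown here. The name `sketchFailureIssueTitle` and the rule-table layout are assumptions (the PR itself uses sequential `if` statements).

```javascript
// Ordered rules: the first flag that is truthy decides the title suffix.
// Only branches whose strings appear in the quoted diff are included.
const TITLE_RULES = [
  ["hasStaleLockFileFailed", "has stale lock file"],
  ["isTimedOut", "timed out"],
  ["hasToolDenialsExceeded", "exceeded tool denial limit"],
  ["hasCacheMissMisconfiguration", "has cache-memory path mismatch"],
  ["hasReportIncomplete", "reported incomplete result"],
  ["hasMissingSafeOutputs", "produced no safe outputs"],
  ["hasMissingTool", "is missing required tool"],
  ["hasMissingData", "is missing required data"],
];

// Hypothetical stand-in for buildFailureIssueTitle(options).
function sketchFailureIssueTitle(options) {
  const rule = TITLE_RULES.find(([flag]) => Boolean(options[flag]));
  const suffix = rule ? rule[1] : "failed"; // generic fallback is unchanged
  return `[aw] ${options.workflowName} ${suffix}`;
}

// When several signals fire, the earlier rule wins.
console.log(
  sketchFailureIssueTitle({
    workflowName: "Demo",
    isTimedOut: true,
    hasMissingSafeOutputs: true,
  })
);
// No signal at all falls back to the generic title.
console.log(sketchFailureIssueTitle({ workflowName: "Demo" }));
```

A table keeps the ordering visible in one place, at the cost of being less greppable than one `if` per title.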

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

github-actions · 2026-06-11T13:33:08Z

🧠 Matt Pocock Skills Reviewer was skipped during the skills-based review.

github-actions · 2026-06-11T13:33:10Z

✅ Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #38615 does not have the implementation label and has 0 new lines of code in business logic directories (threshold 100).

github-actions · 2026-06-11T13:33:21Z

🧪 Test Quality Sentinel completed test quality analysis.

github-actions · 2026-06-11T13:33:23Z

✅ PR Code Quality Reviewer completed the code quality review.

Copilot

Pull request overview

This PR updates the conclusion-job failure issue creation flow to emit failure-specific GitHub issue titles (derived from detected failure signals) instead of always using a generic "[aw] <workflow> failed" title, improving triage and scannability.

Changes:

- Added buildFailureIssueTitle(...) to derive a more specific issue title based on prioritized failure signals (timeout, missing safe outputs, report incomplete, budget/rate-limit, etc.).
- Switched main() to use buildFailureIssueTitle(...) when creating/updating failure issues.
- Expanded handle_agent_failure.test.cjs with assertions for the new timeout / missing-safe-outputs / report-incomplete title variants, plus tighter env cleanup.

Show a summary per file

| File | Description |
| --- | --- |
| actions/setup/js/handle_agent_failure.cjs | Adds failure-aware issue title builder and wires it into the failure-issue creation path. |
| actions/setup/js/handle_agent_failure.test.cjs | Adds focused tests validating the new title selection behavior and improves env cleanup. |

Copilot's findings

Files reviewed: 2/2 changed files
Comments generated: 2

+  if (options.isTimedOut) return `[aw] ${workflowName} timed out`;
+  if (options.hasToolDenialsExceeded) return `[aw] ${workflowName} exceeded tool denial limit`;
+  if (options.hasCacheMissMisconfiguration) return `[aw] ${workflowName} has cache-memory path mismatch`;
+  if (options.hasReportIncomplete) return `[aw] ${workflowName} reported incomplete result`;
+  if (options.hasMissingSafeOutputs) return `[aw] ${workflowName} produced no safe outputs`;


+  if (options.hasStaleLockFileFailed) return `[aw] ${workflowName} has stale lock file`;
+  if (options.isTimedOut) return `[aw] ${workflowName} timed out`;
+  if (options.hasToolDenialsExceeded) return `[aw] ${workflowName} exceeded tool denial limit`;
+  if (options.hasCacheMissMisconfiguration) return `[aw] ${workflowName} has cache-memory path mismatch`;


github-actions · 2026-06-11T13:40:54Z

🧪 Test Quality Sentinel Report

✅ Test Quality Score: 100/100 — Excellent

Analyzed 3 test(s): 3 design, 0 implementation, 0 guideline violation(s).

📊 Metrics & Test Classification (3 tests analyzed)

| Metric | Value |
| --- | --- |
| New/modified tests analyzed | 3 |
| ✅ Design tests (behavioral contracts) | 3 (100%) |
| ⚠️ Implementation tests (low value) | 0 (0%) |
| Tests with error/edge cases | 3 (100%) |
| Duplicate test clusters | 0 |
| Test inflation detected | No (ratio 1.78, under 2:1 threshold) |
| 🚨 Coding-guideline violations | 0 |

Test Classification Details

| Test | File | Classification | Issues Detected |
| --- | --- | --- | --- |
| `uses a precise timeout title when the agent times out` | `actions/setup/js/handle_agent_failure.test.cjs:754` | ✅ Design | None |
| `uses a precise missing safe outputs title` | `actions/setup/js/handle_agent_failure.test.cjs:787` | ✅ Design | None |
| `uses a precise report incomplete title` | `actions/setup/js/handle_agent_failure.test.cjs:818` | ✅ Design | None |

Language Support

Tests analyzed:

🟨 JavaScript (*.test.cjs): 3 tests (vitest)

No Go test files were modified in this PR.

Verdict

✅ Check passed. 0% of new tests are implementation tests (threshold: 30%). All 3 new tests verify observable behavioral contracts — specifically the precise issue title strings produced for distinct failure classes (timed_out, missing safe outputs, report_incomplete). GitHub API calls are mocked via vi.fn() (external I/O — legitimate). Test inflation ratio is 1.78 (under the 2:1 threshold).

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

- Assert on observable outputs, return values, or state changes
- Cover error paths and boundary conditions
- Would catch a behavioral regression if deleted
- Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

- Assert on internal function calls (mocking internals)
- Only test the happy path with typical inputs
- Break during legitimate refactoring even when behavior is correct
- Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

References: §27350333029

🧪 Test quality analysis by Test Quality Sentinel · 244.9 AIC · ⌖ 19.3 AIC · ◷

github-actions

✅ Test Quality Sentinel: 100/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). All 3 new tests verify behavioral contracts for the precise failure-title selection logic.

github-actions

REQUEST_CHANGES — one blocking correctness bug, one maintainability gap

🔴 Blocking: cascade detection broken for all new precise titles

FAILURE_TITLE_PATTERN (/^\[aw\] .+ failed$/) and the findRecentFailureIssues search query ("failed" in:title) were not updated to match the 13 titles now produced by buildFailureIssueTitle. 11 of those 13 titles do not contain "failed", and the two that do (failed to mint GitHub App token, failed lockdown check) have "failed" mid-title rather than at the end, so they also fail the $ anchor.

Result: a cascade of timed out, produced no safe outputs, exceeded budget, etc. issues will never be counted by cascade detection, silently suppressing the rollup that exists specifically to catch systemic failures.

Fix: relax FAILURE_TITLE_PATTERN to /^\[aw\] .+/ and drop the "failed" keyword from the cascade search query. See inline comment.

🟡 Maintainability: `buildFailureIssueTitle` not exported; 10 of 13 branches uncovered

buildFailureMatchCategories (the direct sibling) is exported and unit-tested. buildFailureIssueTitle is not exported, so its branches can only be exercised through the heavyweight main() integration path. The three new tests cover only isTimedOut, hasMissingSafeOutputs, and hasReportIncomplete; the highest-priority branches (hasDailyAICExceeded, maxAICreditsExceeded, ...) have zero coverage for the actual title strings they emit. A single typo in those strings goes undetected. See inline comment.

🔎 Code quality review by PR Code Quality Reviewer · ⌖ 13.4 AIC

github-actions · 2026-06-11T13:44:59Z

+  if (options.hasMissingSafeOutputs) return `[aw] ${workflowName} produced no safe outputs`;
+  if (options.hasMissingTool) return `[aw] ${workflowName} is missing required tool`;
+  if (options.hasMissingData) return `[aw] ${workflowName} is missing required data`;
+  return `[aw] ${workflowName} failed`;


Cascade detection is now blind to 11 of 13 new titles: FAILURE_TITLE_PATTERN at line 2072 (/^\[aw\] .+ failed$/) matches only the generic fallback, and the cascade search query at line 2124 hardcodes "failed" in:title. Every new precise title except the two failed * ones will be silently skipped by findRecentFailureIssues, so a flood of timed out, produced no safe outputs, exceeded budget, etc. issues will never trigger cascade rollup.

💡 Suggested fix

Update the constant and its comment (line 2072):

/** Matches all individual failure issue titles produced by handle_agent_failure */
const FAILURE_TITLE_PATTERN = /^\[aw\] .+/;

And the search query at line 2124 — drop the "failed" keyword so new-style titles are returned by the GitHub search API:

const searchQuery = `repo:${owner}/${repo} is:issue is:open label:agentic-workflows "[aw]" in:title created:>=${since}`;

Of the 13 possible titles this function returns, 11 do not contain the word "failed" at all, and the two that do (failed to mint ... / failed lockdown check) end mid-title so they also fail the $ anchor. Both the search filter and the regex need to widen.
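The mismatch described above is easy to check in isolation. The sketch below runs the old and the suggested pattern against a few titles quoted in this thread; the workflow name "Demo" and the constant names `OLD_PATTERN` and `NEW_PATTERN` are placeholders for the example.

```javascript
// Old pattern: requires the title to END with " failed".
const OLD_PATTERN = /^\[aw\] .+ failed$/;
// Suggested pattern from the review: any title starting with "[aw] ".
const NEW_PATTERN = /^\[aw\] .+/;

const titles = [
  "[aw] Demo failed", // generic fallback
  "[aw] Demo timed out",
  "[aw] Demo produced no safe outputs",
  "[aw] Demo reported incomplete result",
];

for (const title of titles) {
  console.log(title, {
    old: OLD_PATTERN.test(title),
    new: NEW_PATTERN.test(title),
  });
}

// Titles that cascade detection would skip under the old pattern.
const missedByOld = titles.filter((title) => !OLD_PATTERN.test(title));
console.log(missedByOld.length); // 3 of the 4 sample titles
```

One trade-off to keep in mind: the widened pattern also matches any other title that starts with `[aw] `, for example an issue titled `[aw] No-Op Runs`. Whether that matters presumably depends on the label and date filters in the search query, which this sketch does not model.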

github-actions · 2026-06-11T13:44:59Z

+ * @param {boolean} options.maxAICreditsExceeded
+ * @returns {string}
+ */
+function buildFailureIssueTitle(options) {


buildFailureIssueTitle is not exported, leaving 10 of 13 title branches untested at the unit level: its parallel function buildFailureMatchCategories is exported and can be exercised directly; buildFailureIssueTitle cannot. The three added integration tests cover only isTimedOut, hasMissingSafeOutputs, and hasReportIncomplete. The highest-priority branches — hasDailyAICExceeded, maxAICreditsExceeded, aiCreditsRateLimitError, hasAppTokenMintingFailed, hasLockdownCheckFailed, hasStaleLockFileFailed, hasToolDenialsExceeded, hasCacheMissMisconfiguration, hasMissingTool, hasMissingData — have no coverage for their title strings.

💡 Suggested fix

Add buildFailureIssueTitle to module.exports alongside buildFailureMatchCategories:

module.exports = {
  // ...
  buildFailureMatchCategories,
  buildFailureIssueTitle, // add this
  FAILURE_CATEGORIES_PATH,
};

Then add lightweight unit tests that call buildFailureIssueTitle directly with each flag set in isolation, similar to how buildFailureMatchCategories is tested, so a typo in any branch title is caught immediately instead of at runtime.

pelikhan · 2026-06-11T13:48:22Z

@copilot run pr-finisher skill

Done

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot · 2026-06-11T14:28:19Z

@copilot run pr-finisher skill

Completed. I ran the PR-finisher flow, addressed in-scope review feedback, and pushed commit f223fd8 with the fixes.

Copilot AI and others added 3 commits June 11, 2026 13:10

Add precise failure issue title selection

1d17ffe

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Stabilize precise failure title tests

7e80ce1

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Harden failure title test env cleanup

e46b2a5

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI assigned Copilot and pelikhan Jun 11, 2026

Copilot AI changed the title ~~Customize agent failure issue titles in conclusion handler~~ Use failure-specific issue titles in conclusion job Jun 11, 2026

Copilot finished work on behalf of pelikhan June 11, 2026 13:26

Copilot created this pull request from a session on behalf of pelikhan June 11, 2026 13:26 View session

Copilot AI requested a review from pelikhan June 11, 2026 13:26

pelikhan marked this pull request as ready for review June 11, 2026 13:29

Copilot AI review requested due to automatic review settings June 11, 2026 13:29

Copilot started reviewing on behalf of pelikhan June 11, 2026 13:30 View session

Copilot AI reviewed Jun 11, 2026

github-actions Bot mentioned this pull request Jun 11, 2026

[aw] No-Op Runs #38496

Closed

github-actions Bot approved these changes Jun 11, 2026

github-actions Bot previously requested changes Jun 11, 2026

Copilot started work on behalf of pelikhan June 11, 2026 13:48 View session

github-actions Bot mentioned this pull request Jun 11, 2026

[PR Triage Report] PR Triage Report — 2026-06-11 (Run #27350912646) #38619

Closed

github-actions Bot mentioned this pull request Jun 11, 2026

[Contribution Check Report] Contribution Check — 2026-06-11 #38478

Closed

Fix cascade detection for precise failure titles

f223fd8

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Merge branch 'main' into copilot/customize-agent-failure-titles

633c002

Copilot finished work on behalf of pelikhan June 11, 2026 14:29

pelikhan merged commit 92bb513 into main Jun 11, 2026
13 checks passed

pelikhan deleted the copilot/customize-agent-failure-titles branch June 11, 2026 14:31

github-actions Bot mentioned this pull request Jun 11, 2026

[PR Triage Report] PR Triage Report — 2026-06-11 (Run §27371175049) #38709

Closed
