[copilot-opt] Diagnose PR Sous Chef failure root cause to prevent a third scope-drifted agent attempt #39443

Description

@github-actions

Problem

Two Copilot PRs for the aw-fix-pr-sous-chef-failure topic were closed without merging, and each targeted a different root cause. The first addressed empty-queue noop handling; the second addressed CLI proxy networking. The divergence indicates that the agent re-diagnoses the problem from scratch on each retry instead of converging on a verified root cause, which makes any further retry high risk.

Evidence

  • Analysis window: 2026-06-01 to 2026-06-15
  • Sessions analyzed: 50
  • Key metrics and examples:
    • PR #37571 (closed 2026-06-07, 30.1 AIC consumed): "Seed noop safe output when PR Sous Chef queue is empty" — addressed the empty-queue path; reviewer requested specific compiler if-condition changes; multiple rebase/recompile cycles; PR closed without final resolution
    • PR #37972 (closed 2026-06-09): "Allow AWF CLI proxy traffic to host DIFC port in gh-proxy workflows" — targeted CLI proxy networking; a completely different root cause from PR #37571; closed with only an @mention comment and no resolution explanation
    • Across 2 failed attempts, the agent produced implementations for 2 unrelated sub-systems (noop emission vs. proxy port configuration), confirming scope drift between sessions
    • High-risk retry status: without a documented root cause, any third attempt is likely to drift to a third unrelated sub-system
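
The scope-drift signal in the evidence above can be expressed as a simple check: attempts on the same topic that target more than one sub-system. The following is a minimal sketch, not the analysis Copilot Opt actually ran; the `Attempt` type, the `has_scope_drift` helper, and the sub-system labels are hypothetical and assume a reviewer has labeled each PR by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    pr: int          # PR number from the evidence list
    subsystem: str   # hypothetical label assigned by a reviewer


def has_scope_drift(attempts):
    """Return True when attempts for one topic target more than one sub-system."""
    return len({a.subsystem for a in attempts}) > 1


# Labels are illustrative; they paraphrase the two sub-systems named above.
attempts = [
    Attempt(pr=37571, subsystem="noop-emission"),
    Attempt(pr=37972, subsystem="proxy-port-configuration"),
]

print(has_scope_drift(attempts))
```

A single attempt, or several attempts against the same sub-system, would not trip the check.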

Proposed Change

  1. A human maintainer must inspect the actual PR Sous Chef failure logs and document the definitive root cause in a comment on this issue — specifically: is the failure the empty-queue noop path, a proxy networking gap, or something else?
  2. Provide a minimal reproduction case or a specific failing log excerpt so the next agent attempt can verify its fix resolves the actual failure.
  3. Close any remaining confusion between the two prior approaches by explicitly stating which (if either) was on the right track.
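
One way to make the three items above checkable is to treat the maintainer's comment as a small record and refuse to start a new agent attempt until every field is filled in. This is only a sketch under assumptions: `RootCauseRecord`, its field names, and `missing_fields` are hypothetical and do not correspond to any existing tooling in the repository.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RootCauseRecord:
    # Item 1: the definitive diagnosis, e.g. "empty-queue noop path",
    # "proxy networking gap", or something else.
    diagnosis: str = ""
    # Item 2: a minimal reproduction or a specific failing log excerpt.
    failing_log_excerpt: str = ""
    # Item 3: which prior attempt was on the right track; "neither" is a valid answer.
    prior_attempt_on_track: Optional[str] = None


def missing_fields(record: RootCauseRecord) -> List[str]:
    """List the fields a maintainer still needs to fill in before a retry."""
    missing = []
    if not record.diagnosis.strip():
        missing.append("diagnosis")
    if not record.failing_log_excerpt.strip():
        missing.append("failing_log_excerpt")
    if record.prior_attempt_on_track is None:
        missing.append("prior_attempt_on_track")
    return missing


# An empty record reports every field as missing, so a retry would be blocked.
print(missing_fields(RootCauseRecord()))
```

The failing log excerpt doubles as the success criterion: the next PR can be reviewed against whether that specific failure no longer occurs.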

Expected Impact

  • Prevent a third scope-misaligned PR that addresses a different sub-problem and gets closed for a new reason
  • Save agent sessions and AIC budget that would otherwise be consumed on another mis-targeted attempt
  • Establish a single verifiable success criterion so the implementing PR can be reviewed and merged with confidence

Notes

  • Distinct root cause category: missing root-cause documentation, which leads to a different diagnosis (scope drift) in each retry session
  • Data quality caveats: No events.jsonl files were available in the session logs directory; analysis is based on PR-level data only. Step-level timing and tool call data are unavailable.

Generated by ⚡ Copilot Opt · 401 AIC · ⌖ 24.2 AIC · ⊞ 19.4K ·
