Raise the api-proxy maxRuns cap for Code Simplifier (or cut its sub-agent fan-out below 50), and add a classifier flag for the 50/50 invocation cap — it causes 100% of this workflow's failures and is currently bucketed as an unclassified exit-1.
Problem statement
Code Simplifier hits the api-proxy per-run LLM invocation-count cap (maxRuns: 50). Once 50 model invocations are consumed, every subsequent request returns CAPIError: 429 — Maximum LLM invocations exceeded (50 / 50). The Copilot harness retries 3× (all 429), then gives up and the Execute GitHub Copilot CLI step exits 1. This is distinct from AIC credit-budget exhaustion — AIC was only 650.154 / 1000 (Daily workflow AIC guardrail exceeded: false).
Affected workflows and run IDs
- Code Simplifier (.github/workflows/code-simplifier.lock.yml): 6/6 consecutive failures.
Probable root cause
The workflow fans out into many sub-agents (multiple read_agent / scout / scope-filter sub-agent calls observed), exhausting the api-proxy maxRuns: 50 invocation cap (defined in the awf-config apiProxy, separate from maxAiCredits: 1000). The retry loop cannot recover because the cap is per-run and already saturated. Compounding the problem, the conclusion classifier has no flag for the 50/50 cap: GH_AW_AGENTIC_ENGINE_TIMEOUT, GH_AW_AI_CREDITS_RATE_LIMIT_ERROR, GH_AW_UNKNOWN_MODEL_AI_CREDITS, GH_AW_INFERENCE_ACCESS_ERROR, and GH_AW_MODEL_NOT_SUPPORTED_ERROR are all false, so the failure is silently bucketed as a generic exit-1.
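To illustrate why retrying cannot help once a per-run cap is saturated, here is a toy model. CappedProxy, InvocationCapExceeded, and call_with_retries are hypothetical names made up for this sketch; it does not reflect how the real api-proxy or Copilot harness is implemented, only the behaviour described above.

```python
class InvocationCapExceeded(Exception):
    """Stand-in for the proxy's 429 response (hypothetical)."""


class CappedProxy:
    """Toy model of a per-run invocation cap, not the real api-proxy."""

    def __init__(self, max_runs: int) -> None:
        self.max_runs = max_runs
        self.used = 0

    def invoke(self) -> str:
        # The counter never resets within a run, so once it reaches
        # max_runs every later call is rejected, retries included.
        if self.used >= self.max_runs:
            raise InvocationCapExceeded(
                f"429 Maximum LLM invocations exceeded ({self.used} / {self.max_runs})"
            )
        self.used += 1
        return "ok"


def call_with_retries(proxy: CappedProxy, retries: int) -> bool:
    """Return True if the first attempt or any retry succeeds."""
    for _ in range(1 + retries):
        try:
            proxy.invoke()
            return True
        except InvocationCapExceeded:
            continue  # backing off would not change the outcome
    return False


proxy = CappedProxy(max_runs=50)
# A run that wants 60 invocations: only the first 50 can succeed.
successes = sum(call_with_retries(proxy, retries=3) for _ in range(60))
print(successes, proxy.used)
```

In this model the only ways out are a higher max_runs or fewer calls per run, which matches the two remediation options proposed here.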
Proposed remediation
- Reduce fan-out or raise the cap: either lower Code Simplifier's sub-agent count so a normal run stays under 50 invocations, or raise maxRuns for this workflow to a level its legitimate fan-out needs.
- Add a classifier flag for CAPIError 429 "Maximum LLM invocations exceeded" (e.g. GH_AW_INVOCATION_CAP_EXCEEDED) so these failures are categorized and distinguishable from AIC-budget and engine-timeout failures in future investigations.
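A minimal sketch of what such a classifier check could look like, assuming the classifier has the agent log available as plain text. The function name classify_invocation_cap and the returned dictionary shape are illustrative assumptions, and GH_AW_INVOCATION_CAP_EXCEEDED is only the example flag name suggested above; the real conclusion classifier may be structured quite differently.

```python
import re

# Matches the signature quoted in this issue, with or without a dash
# after "429" and with flexible spacing around the "used / limit" pair.
INVOCATION_CAP_RE = re.compile(
    r"CAPIError:?\s*429\b.*?Maximum LLM invocations exceeded\s*\((\d+)\s*/\s*(\d+)\)"
)


def classify_invocation_cap(log_text: str) -> dict:
    """Return a hypothetical flag dict for the invocation-cap failure mode."""
    match = INVOCATION_CAP_RE.search(log_text)
    if match is None:
        return {"GH_AW_INVOCATION_CAP_EXCEEDED": False}
    used, limit = int(match.group(1)), int(match.group(2))
    return {"GH_AW_INVOCATION_CAP_EXCEEDED": True, "used": used, "limit": limit}


sample = "CAPIError: 429 Maximum LLM invocations exceeded (50 / 50)"
print(classify_invocation_cap(sample))
```

Capturing the used / limit pair alongside the flag would let a future investigation report the configured cap without having to dig through awf-config.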
Success criteria / verification
- Code Simplifier completes without a Maximum LLM invocations exceeded (50/50) error for ≥3 consecutive scheduled runs.
- A 50/50 invocation-cap failure (if it recurs) is surfaced as a dedicated classified flag rather than a generic exit-1.
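The first criterion can be checked mechanically. The sketch below assumes run results are available as (conclusion, log_text) pairs ordered oldest to newest; that input shape, the "success" conclusion string, and the function name meets_success_criterion are assumptions made for illustration, not an existing API.

```python
SIGNATURE = "Maximum LLM invocations exceeded"


def meets_success_criterion(runs, required=3):
    """Check that the latest `required` runs succeeded without the cap error.

    `runs` is a list of (conclusion, log_text) tuples, oldest first.
    """
    if len(runs) < required:
        return False  # not enough history to decide yet
    return all(
        conclusion == "success" and SIGNATURE not in log_text
        for conclusion, log_text in runs[-required:]
    )


history = [
    ("failure", "CAPIError: 429 Maximum LLM invocations exceeded (50 / 50)"),
    ("success", "done"),
    ("success", "done"),
    ("success", "done"),
]
print(meets_success_criterion(history))
```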
Parent: #29109. Analyzed runs: 27488668377, 27456907583.
Related to #29109
Generated by 🔍 [aw] Failure Investigator (6h) · 343.9 AIC · ⌖ 12.7 AIC · ⊞ 4.5K · ◷
Still 100% failing — raise maxRuns above 50 or cut Code Simplifier's tool-call volume; 8-day outage continues.
Fresh recurrence (6h failure sweep, 2026-06-16):
- Run §27594887412: agent job failed at "Execute GitHub Copilot CLI".
- Confirmed signature: CAPIError: 429 Maximum LLM invocations exceeded (50 / 50) after "retried 5 times (total retry wait 87.57s)"; awf-config shows apiProxy.maxRuns: 50.
- Outage span: failed every day 06-09 → 06-16 (8 consecutive scheduled runs, 100%).
Remediation unchanged: increase the api-proxy maxRuns cap for this workflow or reduce its per-run invocation count (fewer tool round-trips / tighter prompt).
Generated by 🔍 [aw] Failure Investigator (6h) · 263.8 AIC · ⌖ 12.1 AIC · ⊞ 4.5K · ◷