fix(smoke-claude): raise turn budget to 8 and fix add_comment usage by lpcox · Pull Request #5328 · github/gh-aw-firewall

lpcox · 2026-06-20T15:59:30Z

Problem

The Smoke Claude CI job has been failing with:

API Error: 403 Maximum LLM invocations exceeded (5 / 5)

The workflow was capped at max-turns: 5, which gh-aw expands into three coupled limits: --max-turns 5 (Claude harness), GH_AW_MAX_TURNS: 5 (env), and the api-proxy maxRuns: 5 hard cap. The agent wasted one of those five invocations on an empty-args add_comment schema probe — it ran safeoutputs add_comment . with the JSON piped via stdin, which yields args: {} and is rejected by the MCP server (-32602 Empty arguments are not allowed). Combined with other minor waste, the agent hit the cumulative cap before ever landing a successful add_comment. Harness retries can't recover because maxRuns is cumulative per run.
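
To make the cumulative-cap behaviour concrete, here is a minimal sketch of a per-run invocation counter. It is not the api-proxy implementation: the `InvocationBudget` class, the `simulate` helper, and the assumption that the smoke task needs five productive invocations are all hypothetical, chosen only to show why one wasted call is fatal at a cap of 5 but not at 8.

```typescript
// Hypothetical model of a cumulative per-run cap (not the real api-proxy code).
class InvocationBudget {
  private used = 0;
  private readonly maxRuns: number;

  constructor(maxRuns: number) {
    this.maxRuns = maxRuns;
  }

  // Every invocation counts, including ones spent on a rejected tool call.
  tryInvoke(): boolean {
    if (this.used >= this.maxRuns) {
      return false; // analogous to "Maximum LLM invocations exceeded"
    }
    this.used += 1;
    return true;
  }
}

// Returns true if `wasted` + `needed` invocations all fit under the cap.
// `needed` is an illustrative figure, not a measured one.
function simulate(maxRuns: number, wasted: number, needed: number): boolean {
  const budget = new InvocationBudget(maxRuns);
  for (let i = 0; i < wasted + needed; i++) {
    if (!budget.tryInvoke()) {
      return false;
    }
  }
  return true;
}

console.log(simulate(5, 1, 5)); // one wasted probe exhausts a cap of 5
console.log(simulate(8, 1, 5)); // the same run fits under a cap of 8
```

Because the counter is never reset within a run, a harness-level retry starts with whatever budget is left rather than a fresh one.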

Fix

Two-pronged:

More headroom: bump max-turns from 5 to 8.
Correct invocation guidance: add a prompt bullet telling the agent to pass add_comment arguments inline as a single JSON object (e.g. safeoutputs add_comment '{"item_number": <pr_number>, "body": "<markdown>"}') and never pipe JSON via stdin or pass . (or any other placeholder) as the argument, which sends empty args and wastes a turn.
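
As an illustration of the inline form, the sketch below builds the command string from a typed object. The `AddCommentArgs` interface and `buildAddCommentCommand` helper are hypothetical names introduced here; only the `safeoutputs add_comment '<json>'` shape and the `item_number` / `body` fields come from the guidance above. The quoting assumes a POSIX shell.

```typescript
// Hypothetical helper: field names follow the guidance above, everything else is illustrative.
interface AddCommentArgs {
  item_number: number;
  body: string;
}

function buildAddCommentCommand(args: AddCommentArgs): string {
  // Refuse to build a call that would arrive with missing or empty arguments.
  if (!Number.isInteger(args.item_number) || args.body.length === 0) {
    throw new Error("add_comment needs an integer item_number and a non-empty body");
  }
  const json = JSON.stringify(args);
  // Single-quote the JSON for a POSIX shell, escaping any embedded single quotes.
  const quoted = "'" + json.replace(/'/g, "'\\''") + "'";
  return `safeoutputs add_comment ${quoted}`;
}

console.log(buildAddCommentCommand({ item_number: 5328, body: "Overall: PASS" }));
```

The whole payload travels as one argument, so nothing depends on stdin being forwarded to the tool.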

Changes

.github/workflows/smoke-claude.md — max-turns: 8 + new prompt bullet.
.github/workflows/smoke-claude.lock.yml — recompiled with gh aw compile (v0.80.6) and re-run through scripts/ci/postprocess-smoke-workflows.ts. Net change: maxRuns 5→8, --max-turns 5→8, GH_AW_MAX_TURNS 5→8. Prompt body is runtime-imported from the .md, so the new guidance takes effect via the .md edit.
scripts/ci/smoke-claude-workflow.test.ts — updated assertions to max-turns: 8 / --max-turns 8 and added an assertion for the new guidance.
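
The kind of check the updated test performs can be sketched as follows. This is not the content of smoke-claude-workflow.test.ts, which is not shown here: `findBudgetProblems` is a hypothetical helper, the regular expression assumes `max-turns` sits on its own line, and the guidance substring is taken from the prompt bullet quoted in the review on this PR.

```typescript
// Hypothetical check over the workflow source text; returns a list of problems found.
function findBudgetProblems(source: string, expectedTurns: number): string[] {
  const problems: string[] = [];

  // Look for a line such as "max-turns: 8".
  const match = source.match(/^\s*max-turns:\s*(\d+)\s*$/m);
  if (!match) {
    problems.push("max-turns is not set");
  } else if (Number(match[1]) !== expectedTurns) {
    problems.push(`max-turns is ${match[1]}, expected ${expectedTurns}`);
  }

  // The prompt must carry the inline-JSON guidance.
  if (!source.includes("Pass arguments inline as a single JSON object")) {
    problems.push("inline-JSON guidance is missing");
  }
  return problems;
}

// Illustrative inputs, not the real workflow file.
const before = "engine:\n  max-turns: 5\n";
const after =
  "engine:\n  max-turns: 8\n- Pass arguments inline as a single JSON object\n";

console.log(findBudgetProblems(before, 8));
console.log(findBudgetProblems(after, 8));
```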

Note: the separate Security Guard failure observed on the same PR is an unrelated Copilot auth-403 (credential/entitlement) issue and is out of scope here.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The Smoke Claude job was failing with "403 Maximum LLM invocations exceeded (5 / 5)". The agent burned one of only five api-proxy runs on an empty-args `add_comment` schema probe (`safeoutputs add_comment .` with JSON piped via stdin yields `args: {}`, rejected by MCP), and the cumulative cap left no room to retry.

Two-pronged fix:
- Raise `max-turns` 5 -> 8 for headroom (expands to --max-turns, GH_AW_MAX_TURNS, and apiProxy maxRuns).
- Add explicit prompt guidance to pass `add_comment` arguments inline as a single JSON object and never pipe JSON via stdin or pass `.`.

Recompiled the lock and re-ran the smoke post-processing script. Updated smoke-claude-workflow.test.ts assertions accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-06-20T16:00:38Z

✅ Coverage Check Passed

Overall Coverage

Metric	Base	PR	Delta
Lines	97.62%	97.66%	📈 +0.04%
Statements	97.57%	97.61%	📈 +0.04%
Functions	98.85%	98.85%	➡️ +0.00%
Branches	93.22%	93.25%	📈 +0.03%
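
The Delta column is consistent with the PR value minus the base value, rounded to two decimals. A minimal sketch of that arithmetic follows; `coverageDelta` is a hypothetical helper and is not taken from scripts/ci/compare-coverage.ts.

```typescript
// Hypothetical helper: formats the change between two coverage percentages.
function coverageDelta(base: number, pr: number): string {
  const delta = pr - base;
  const sign = delta >= 0 ? "+" : ""; // negative values already carry "-"
  return `${sign}${delta.toFixed(2)}%`;
}

console.log(coverageDelta(97.62, 97.66)); // Lines
console.log(coverageDelta(98.85, 98.85)); // Functions
console.log(coverageDelta(93.22, 93.25)); // Branches
```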

📁 Per-file Coverage Changes (1 files)

File	Lines (Before → After)	Statements (Before → After)
`src/workdir-setup.ts`	92.7% → 94.5% (+1.82%)	92.7% → 94.5% (+1.82%)

Coverage comparison generated by scripts/ci/compare-coverage.ts

Copilot

Pull request overview

This PR adjusts the Smoke Claude workflow configuration to reduce CI failures caused by exhausting the LLM invocation/turn budget before completing required safeoutputs actions.

Changes:

Increased the configured turn budget from 5 → 8 in the source workflow and corresponding compiled lock workflow.
Added prompt guidance to avoid empty-args safeoutputs add_comment “schema probe” invocations by passing arguments inline.
Updated CI tests to assert the new turn budget and the presence of the new prompt guidance.

Summary per file

File	Description
`scripts/ci/smoke-claude-workflow.test.ts`	Updates assertions for `max-turns: 8` / `--max-turns 8` and checks the new inline-JSON guidance text.
`.github/workflows/smoke-claude.md`	Raises `max-turns` and adds guidance for correct `safeoutputs add_comment` invocation.
`.github/workflows/smoke-claude.lock.yml`	Regenerated lock workflow reflecting `maxRuns: 8`, `--max-turns 8`, and `GH_AW_MAX_TURNS: 8`.

Copilot's findings

Files reviewed: 3/3 changed files
Comments generated: 2

 The JSON contains: `result` (PASS/FAIL), `api_status`, `gh_check`, `file_status`, `event`, `pr_number`.

 - If `event` is `pull_request`: call `add_comment` with `item_number` set to `pr_number` and a body listing each check result plus the overall `result`; then call `add_labels` with `["smoke-claude"]` only if `result` is `PASS`.
+- Pass arguments inline as a single JSON object, e.g. `safeoutputs add_comment '{"item_number": <pr_number>, "body": "<markdown>"}'`. Do NOT pipe JSON via stdin and do NOT pass `.` (or any placeholder) as the argument — that sends empty arguments and the call is rejected as a schema probe, wasting a turn.
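
The branching these prompt bullets describe can be sketched as a small planning function. The field names on `SmokeResult` come from the prompt text above; the `PlannedCall` type, the `planSafeOutputs` name, the comment body layout, and the `labels` key for `add_labels` are assumptions made for illustration.

```typescript
// Fields as listed in the prompt; value types are assumed.
interface SmokeResult {
  result: "PASS" | "FAIL";
  api_status: string;
  gh_check: string;
  file_status: string;
  event: string;
  pr_number: number;
}

// Hypothetical representation of a safeoutputs call.
interface PlannedCall {
  tool: "add_comment" | "add_labels";
  args: Record<string, unknown>;
}

function planSafeOutputs(r: SmokeResult): PlannedCall[] {
  // Only pull_request events produce a comment or label.
  if (r.event !== "pull_request") {
    return [];
  }
  const body = [
    `API status: ${r.api_status}`,
    `gh check: ${r.gh_check}`,
    `File status: ${r.file_status}`,
    `Overall result: ${r.result}`,
  ].join("\n");

  const calls: PlannedCall[] = [
    { tool: "add_comment", args: { item_number: r.pr_number, body } },
  ];
  // The label is applied only when the overall result is PASS.
  if (r.result === "PASS") {
    calls.push({ tool: "add_labels", args: { labels: ["smoke-claude"] } });
  }
  return calls;
}
```

Each planned call carries a fully populated `args` object, which is the property the new guidance is meant to guarantee.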


lpcox · 2026-06-20T16:03:49Z

@copilot address review feedback

github-actions · 2026-06-20T16:04:10Z

✅ Copilot review passed with no inline comments.

@lpcox Add the ready-for-aw label to this PR to trigger agentic CI smoke tests.

Copilot · 2026-06-20T16:07:03Z

> @copilot address review feedback

Addressed in 44dcf79. I replaced the invalid inline JSON example in smoke-claude.md with a copy-pasteable JSON object and renamed the stale Jest test description so it matches max-turns: 8.

github-actions · 2026-06-20T16:12:12Z

✅ Smoke Copilot BYOK AOAI (Entra) completed. Copilot AOAI BYOK (Entra) mode operational. 🔓

github-actions · 2026-06-20T16:12:13Z

🚀 Security Guard has started processing this pull request

github-actions · 2026-06-20T16:12:13Z

✅ Smoke Copilot BYOK AOAI (api-key) completed. Copilot AOAI BYOK (api-key) mode operational. 🔓

github-actions · 2026-06-20T16:12:13Z

🔌 Smoke Services — All services reachable! ✅

github-actions · 2026-06-20T16:12:14Z

✅ Smoke Claude passed

github-actions · 2026-06-20T16:12:15Z

📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤

github-actions · 2026-06-20T16:12:15Z

✅ Smoke Gemini completed. All facets verified. 💎

Smoke test completed with FAIL status. Connectivity issues and missing MCP tools reported.

github-actions · 2026-06-20T16:12:16Z

📡 Smoke OTel Tracing completed. All tracing scenarios validated. ✅

github-actions · 2026-06-20T16:12:17Z

Chroot tests passed! Smoke Chroot - All security and functionality tests succeeded.

github-actions · 2026-06-20T16:12:18Z

✅ Smoke Copilot BYOK completed. Copilot BYOK mode operational. 🔓

github-actions · 2026-06-20T16:12:20Z

✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟

github-actions · 2026-06-20T16:12:27Z

🔑 Smoke Copilot PAT: PAT auth validated. All systems operational. ✅

github-actions · 2026-06-20T16:12:27Z

✅ Build Test Suite completed successfully!

github-actions · 2026-06-20T16:15:19Z

🤖 Smoke Test Results

PR: fix(smoke-claude): raise turn budget to 8 and fix add_comment usage — @lpcox

Test	Result
GitHub MCP	✅ Connected (PR listed)
HTTP Connectivity	⚠️ Pre-step data unavailable
File Write/Read	⚠️ Pre-step data unavailable

Overall: FAIL — pre-step template variables were not substituted; tests 2 & 3 unverifiable.

📰 BREAKING: Report filed by Smoke Copilot

github-actions · 2026-06-20T16:15:26Z

@lpcox

Merged PR list validation: ✅
GitHub.com Connectivity: ✅
File Write/Read Test: ✅
BYOK Inference Test: ✅

Running in direct BYOK mode (COPILOT_PROVIDER_API_KEY + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw)

Overall: PASS

🔑 BYOK (AOAI api-key) report filed by Smoke Copilot BYOK AOAI (api-key)

github-actions · 2026-06-20T16:15:43Z

Smoke Test: Claude Engine Validation

Check	Result
API status	✅ PASS
gh check	✅ PASS
File status	✅ PASS

Overall result: PASS

Generated by Smoke Claude for issue #5328 · 60.9 AIC · ⊞ 3.1K · ◷

github-actions · 2026-06-20T16:15:46Z

Smoke Test: Copilot BYOK (Direct Mode) ✅ PASS

Tests:

✅ GitHub MCP connectivity (2 merged PRs fetched successfully)
✅ GitHub.com reachability (HTTP 200)
✅ File write/read verification
✅ BYOK inference path active (direct mode via api-proxy → api.githubcopilot.com)

Status: Running in direct BYOK mode (COPILOT_PROVIDER_API_KEY) with api-proxy sidecar authentication.

cc @lpcox

🔑 BYOK report filed by Smoke Copilot BYOK

github-actions · 2026-06-20T16:16:09Z

@lpcox
✅ GitHub MCP testing
✅ GitHub.com connectivity
✅ File write/read test
✅ BYOK inference test
Running in direct BYOK mode (AWF_AUTH_TYPE=github-oidc + AWF_AUTH_AZURE_* + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw) authenticated via Microsoft Entra
PASS

🪪 BYOK (AOAI Entra) report filed by Smoke Copilot BYOK AOAI (Entra)

github-actions · 2026-06-20T16:16:22Z

Merged PRs reviewed: [WIP] Refactor extract functions in token-parsers file, [WIP] Refactor to extract rule-setup phases into named bash functions

✅ GitHub page title contains "GitHub"
✅ Smoke test file created and verified
✅ Discussion lookup returned #5327
✅ npm ci && npm run build

Overall status: PASS

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

registry.npmjs.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex

github-actions · 2026-06-20T16:16:41Z

Chroot Version Comparison Results

Runtime	Host Version	Chroot Version	Match?
Python	Python 3.12.13	Python 3.12.3	❌
Node.js	v24.16.0	v22.22.3	❌
Go	go1.22.12	go1.22.12	✅

Overall: ❌ Not all tests passed — Python and Node.js versions differ between host and chroot environments.

Tested by Smoke Chroot

github-actions · 2026-06-20T16:16:42Z

Smoke Test: API Proxy OpenTelemetry Tracing

Scenario	Result	Notes
1. Module Loading	✅	`otel.js` loads cleanly; exports: `startRequestSpan`, `setTokenAttributes`, `setBudgetAttributes`, `endSpan`, `endSpanError`, `shutdown`, `isEnabled` + internal test helpers
2. Test Suite	✅	59 tests passed, 0 failed across `otel.test.js` + `otel-fanout.test.js`
3. Env Var Forwarding	✅	`api-proxy-service-config.ts` forwards `GH_AW_OTLP_ENDPOINTS`, `OTEL_EXPORTER_OTLP_ENDPOINT`, `OTEL_EXPORTER_OTLP_HEADERS`, `GITHUB_AW_OTEL_TRACE_ID`, `GITHUB_AW_OTEL_PARENT_SPAN_ID`
4. Token Tracker Integration	✅	`onUsage` callback exists in `token-tracker-http.js`; invoked as `onUsage(normalized, model)` after usage extraction
5. OTEL Diagnostics	✅	No OTLP endpoint configured in this run → graceful fallback to `FileSpanExporter` writing `otel.jsonl`; no errors

All 5 scenarios pass. OTEL integration is functioning correctly.

📡 OTel tracing validated by Smoke OTel Tracing

github-actions · 2026-06-20T16:16:43Z

Smoke Test — Copilot PAT Auth

Test	Result
GitHub MCP connectivity	✅
GitHub.com HTTP status	✅ 200
File write/read	❌ template vars not substituted

PR: "fix(smoke-claude): raise turn budget to 8 and fix add_comment usage"
Author: @lpcox | Auth mode: PAT (COPILOT_GITHUB_TOKEN)
Overall: FAIL — file test data unavailable

🔑 PAT report filed by Smoke Copilot PAT

github-actions · 2026-06-20T16:18:15Z

Smoke Test Results — FAIL ❌

Check	Result
Redis PING	❌ timeout — no PONG
PostgreSQL `pg_isready`	❌ `no response`
PostgreSQL `SELECT 1`	❌ not reached

host.docker.internal resolves to 172.17.0.1 but ports 6379 and 5432 are unreachable. Service containers may not be running in this workflow.

🔌 Service connectivity validated by Smoke Services

github-actions · 2026-06-20T16:18:46Z

🏗️ Build Test Suite Results

Ecosystem	Project	Build/Install	Tests	Status
Bun	elysia	✅	1/1 passed	✅ PASS
Bun	hono	✅	1/1 passed	✅ PASS
C++	fmt	✅	N/A	✅ PASS
C++	json	✅	N/A	✅ PASS
Deno	oak	N/A	1/1 passed	✅ PASS
Deno	std	N/A	1/1 passed	✅ PASS
.NET	hello-world	✅	N/A	✅ PASS
.NET	json-parse	✅	N/A	✅ PASS
Go	color	✅	1/1 passed	✅ PASS
Go	env	✅	1/1 passed	✅ PASS
Go	uuid	✅	1/1 passed	✅ PASS
Java	gson	✅	1/1 passed	✅ PASS
Java	caffeine	✅	1/1 passed	✅ PASS
Node.js	clsx	✅	All passed	✅ PASS
Node.js	execa	✅	All passed	✅ PASS
Node.js	p-limit	✅	All passed	✅ PASS
Rust	fd	✅	1/1 passed	✅ PASS
Rust	zoxide	✅	1/1 passed	✅ PASS

Overall: 8/8 ecosystems passed — ✅ PASS

Generated by Build Test Suite for issue #5328 · 47.9 AIC · ⊞ 7.7K · ◷

github-actions · 2026-06-20T16:27:26Z

Smoke Test: Gemini Engine Validation

GitHub MCP Testing: ❌ (Tools not available in context)
GitHub.com Connectivity: ❌ (SSL connect error 35)
File Writing Testing: ✅
Bash Tool Testing: ✅

Overall status: FAIL

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

localhost

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "localhost"

See Network Configuration for more information.

💎 Faceted by Smoke Gemini

Copilot AI review requested due to automatic review settings June 20, 2026 15:59

Copilot started reviewing on behalf of lpcox June 20, 2026 16:00

Copilot AI reviewed Jun 20, 2026

Copilot started work on behalf of lpcox June 20, 2026 16:04

fix: address smoke claude review feedback

44dcf79

Copilot finished work on behalf of lpcox June 20, 2026 16:07

lpcox added the ready-for-aw label Jun 20, 2026

lpcox temporarily deployed to aoai-model June 20, 2026 16:12 — with GitHub Actions Inactive

github-actions Bot added the smoke-copilot-byok-aoai-apikey label Jun 20, 2026

github-actions Bot added the smoke-claude label Jun 20, 2026

github-actions Bot added the smoke-copilot-byok label Jun 20, 2026

lpcox temporarily deployed to aoai-model June 20, 2026 16:15 — with GitHub Actions Inactive

github-actions Bot added the smoke-copilot-byok-aoai-entra label Jun 20, 2026

lpcox temporarily deployed to aoai-model June 20, 2026 16:16 — with GitHub Actions Inactive

github-actions Bot added the smoke-codex label Jun 20, 2026

github-actions Bot added the build-test label Jun 20, 2026

lpcox merged commit 10d3a2e into main Jun 20, 2026
91 of 94 checks passed

lpcox deleted the fix/smoke-claude-turn-budget branch June 20, 2026 16:52

This was referenced Jun 20, 2026

perf(security-guard): prioritize security-relevant files in PR diff #5329

Merged

Fix gVisor workflow: Add proper health checks for Squid and Envoy #5237

Merged

[Test Coverage] deduplicate docker-manager.ts re-export tests #5341

Merged
