fix(smoke-claude): raise turn budget to 8 and fix add_comment usage #5328

Merged
lpcox merged 2 commits into main from fix/smoke-claude-turn-budget on Jun 20, 2026

@lpcox lpcox commented Jun 20, 2026

Collaborator

Problem

The Smoke Claude CI job has been failing with:

API Error: 403 Maximum LLM invocations exceeded (5 / 5)

The workflow was capped at max-turns: 5, which gh-aw expands into three coupled limits: --max-turns 5 (Claude harness), GH_AW_MAX_TURNS: 5 (env), and the api-proxy maxRuns: 5 hard cap. The agent wasted one of those five invocations on an empty-args add_comment schema probe — it ran safeoutputs add_comment . with the JSON piped via stdin, which yields args: {} and is rejected by the MCP server (-32602 Empty arguments are not allowed). Combined with other minor waste, the agent hit the cumulative cap before ever landing a successful add_comment. Harness retries can't recover because maxRuns is cumulative per run.
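
The effect of a cumulative cap can be sketched in a few lines. This is a hypothetical model, not the real api-proxy code: the `InvocationBudget` class, the `replay` helper, and the list of turn labels are all invented for illustration. It only shows why one wasted invocation matters when every call counts against the same counter, whether or not the tool call behind it succeeds.

```typescript
// Hypothetical model of a cumulative invocation cap (not the real api-proxy).
class InvocationBudget {
  private used = 0;
  private readonly maxRuns: number;

  constructor(maxRuns: number) {
    this.maxRuns = maxRuns;
  }

  // Every call counts, even if the tool call it triggers is later rejected.
  invoke(label: string): string {
    if (this.used >= this.maxRuns) {
      throw new Error(
        `403 Maximum LLM invocations exceeded (${this.used} / ${this.maxRuns})`
      );
    }
    this.used += 1;
    return `${label} (${this.used} / ${this.maxRuns})`;
  }
}

// Replays a list of turns and reports how many fit inside the budget.
function replay(
  maxRuns: number,
  turns: string[]
): { completed: number; error?: string } {
  const budget = new InvocationBudget(maxRuns);
  let completed = 0;
  for (const turn of turns) {
    try {
      budget.invoke(turn);
      completed += 1;
    } catch (err) {
      return { completed, error: (err as Error).message };
    }
  }
  return { completed };
}

// Illustrative turn list (assumed): one wasted probe pushes the run to six turns.
const illustrativeTurns = [
  "read pre-step results",
  "empty-args add_comment probe",
  "inspect tool schema",
  "retry with inline JSON",
  "add_comment",
  "add_labels",
];

console.log(replay(5, illustrativeTurns)); // stops before the sixth turn
console.log(replay(8, illustrativeTurns)); // all six turns fit
```

Under these assumptions the five-run budget is exhausted before the final turn, while a budget of eight leaves room to spare.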

Fix

Two-pronged:

  1. More headroom: bump max-turns 5 → 8.
  2. Correct invocation guidance: add a prompt bullet telling the agent to pass add_comment arguments inline as a single JSON object (e.g. safeoutputs add_comment '{"item_number": <pr_number>, "body": "<markdown>"}') and never to pipe JSON via stdin or pass . (or any other placeholder) as the argument, since that sends empty args and wastes a turn.
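
As a rough illustration of the second point, the sketch below builds the inline argument and mimics the empty-arguments rejection. Both helpers (`buildAddCommentArgs`, `checkInlineArgs`) are hypothetical and are not part of safeoutputs or the MCP server. Treating an unparseable argument such as `.` as empty args is an assumption based on the behavior described above. The code -32602 is the standard JSON-RPC "Invalid params" error.

```typescript
// Hypothetical helpers; the real safeoutputs CLI and MCP server are not shown in this PR.
interface AddCommentArgs {
  item_number: number;
  body: string;
}

// Serializes the arguments as one JSON object, ready to be passed inline.
function buildAddCommentArgs(prNumber: number, body: string): string {
  if (!Number.isInteger(prNumber) || prNumber <= 0) {
    throw new Error("item_number must be a positive integer");
  }
  if (body.trim() === "") {
    throw new Error("body must not be empty");
  }
  const args: AddCommentArgs = { item_number: prNumber, body };
  return JSON.stringify(args);
}

type CheckResult =
  | { ok: true }
  | { ok: false; code: number; message: string };

// Mimics the rejection described above: anything that is not a non-empty
// JSON object is treated as empty args (an assumption) and refused.
function checkInlineArgs(raw: string): CheckResult {
  let parsed: unknown = {};
  try {
    parsed = JSON.parse(raw);
  } catch {
    parsed = {}; // e.g. "." is not JSON, so it is modeled as empty args
  }
  const isNonEmptyObject =
    typeof parsed === "object" &&
    parsed !== null &&
    !Array.isArray(parsed) &&
    Object.keys(parsed).length > 0;
  if (!isNonEmptyObject) {
    return { ok: false, code: -32602, message: "Empty arguments are not allowed" };
  }
  return { ok: true };
}

const inline = buildAddCommentArgs(5328, "Overall result: PASS");
console.log(inline, checkInlineArgs(inline));
console.log(checkInlineArgs("."));
```

Building the string once and passing it as a single argument avoids both failure modes named in the guidance: there is no stdin pipe and no placeholder.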

Changes

  • .github/workflows/smoke-claude.md: set max-turns: 8 and added the new prompt bullet.
  • .github/workflows/smoke-claude.lock.yml: recompiled with gh aw compile (v0.80.6) and re-run through scripts/ci/postprocess-smoke-workflows.ts. Net change: maxRuns 5→8, --max-turns 5→8, GH_AW_MAX_TURNS 5→8. The prompt body is runtime-imported from the .md, so the new guidance takes effect via the .md edit.
  • scripts/ci/smoke-claude-workflow.test.ts: updated assertions to max-turns: 8 / --max-turns 8 and added an assertion for the new guidance.
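
The kind of check the test file performs could look roughly like the sketch below. It is not the contents of smoke-claude-workflow.test.ts: the `missingExpectations` helper is hypothetical, the inputs are plain strings rather than files read from disk, and the guidance phrase is taken from the prompt bullet quoted in the review thread, which may differ from the final wording.

```typescript
// Hypothetical sketch of text-level assertions; not the repository's actual test.
interface Expectation {
  file: "md" | "lock";
  needle: string;
}

const expectations: Expectation[] = [
  { file: "md", needle: "max-turns: 8" },
  { file: "lock", needle: "--max-turns 8" },
  // Assumed guidance phrase; the merged wording may differ.
  { file: "md", needle: "Pass arguments inline as a single JSON object" },
];

// Returns a description of every expected fragment that is absent.
function missingExpectations(md: string, lock: string): string[] {
  return expectations
    .filter((e) => !(e.file === "md" ? md : lock).includes(e.needle))
    .map((e) => `${e.file}: ${e.needle}`);
}

// Tiny stand-ins for the two workflow files.
const sampleMd =
  "max-turns: 8\n- Pass arguments inline as a single JSON object, e.g. ...";
const sampleLock = "claude --max-turns 8";

console.log(missingExpectations(sampleMd, sampleLock));
console.log(missingExpectations("max-turns: 5", "claude --max-turns 5"));
```

Checking the source and the compiled lock together is what catches a lock file that was not regenerated after the .md edit.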

Note: the separate Security Guard failure observed on the same PR is an unrelated Copilot auth-403 (credential/entitlement) issue and is out of scope here.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The Smoke Claude job was failing with "403 Maximum LLM invocations
exceeded (5 / 5)". The agent burned one of only five api-proxy runs on
an empty-args `add_comment` schema probe (`safeoutputs add_comment .`
with JSON piped via stdin yields `args: {}`, rejected by MCP), and the
cumulative cap left no room to retry.

Two-pronged fix:
- Raise `max-turns` 5 -> 8 for headroom (expands to --max-turns,
  GH_AW_MAX_TURNS, and apiProxy maxRuns).
- Add explicit prompt guidance to pass `add_comment` arguments inline as
  a single JSON object and never pipe JSON via stdin or pass `.`.

Recompiled the lock and re-ran the smoke post-processing script. Updated
smoke-claude-workflow.test.ts assertions accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 20, 2026 15:59
@github-actions

github-actions Bot commented Jun 20, 2026

Contributor

✅ Coverage Check Passed

Overall Coverage

| Metric | Base | PR | Delta |
|---|---|---|---|
| Lines | 97.62% | 97.66% | 📈 +0.04% |
| Statements | 97.57% | 97.61% | 📈 +0.04% |
| Functions | 98.85% | 98.85% | ➡️ +0.00% |
| Branches | 93.22% | 93.25% | 📈 +0.03% |

📁 Per-file Coverage Changes (1 files)

| File | Lines (Before → After) | Statements (Before → After) |
|---|---|---|
| src/workdir-setup.ts | 92.7% → 94.5% (+1.82%) | 92.7% → 94.5% (+1.82%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

Copilot AI left a comment

Contributor

Pull request overview

This PR adjusts the Smoke Claude workflow configuration to reduce CI failures caused by exhausting the LLM invocation/turn budget before completing required safeoutputs actions.

Changes:

  • Increased the configured turn budget from 5 → 8 in the source workflow and corresponding compiled lock workflow.
  • Added prompt guidance to avoid empty-args safeoutputs add_comment “schema probe” invocations by passing arguments inline.
  • Updated CI tests to assert the new turn budget and the presence of the new prompt guidance.
Show a summary per file

| File | Description |
|---|---|
| scripts/ci/smoke-claude-workflow.test.ts | Updates assertions for max-turns: 8 / --max-turns 8 and checks the new inline-JSON guidance text. |
| .github/workflows/smoke-claude.md | Raises max-turns and adds guidance for correct safeoutputs add_comment invocation. |
| .github/workflows/smoke-claude.lock.yml | Regenerated lock workflow reflecting maxRuns: 8, --max-turns 8, and GH_AW_MAX_TURNS: 8. |

Copilot's findings

  • Files reviewed: 3/3 changed files
  • Comments generated: 2

Comment thread .github/workflows/smoke-claude.md Outdated
The JSON contains: `result` (PASS/FAIL), `api_status`, `gh_check`, `file_status`, `event`, `pr_number`.

- If `event` is `pull_request`: call `add_comment` with `item_number` set to `pr_number` and a body listing each check result plus the overall `result`; then call `add_labels` with `["smoke-claude"]` only if `result` is `PASS`.
- Pass arguments inline as a single JSON object, e.g. `safeoutputs add_comment '{"item_number": <pr_number>, "body": "<markdown>"}'`. Do NOT pipe JSON via stdin and do NOT pass `.` (or any placeholder) as the argument — that sends empty arguments and the call is rejected as a schema probe, wasting a turn.
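
The branching in these two prompt bullets can be sketched as a small planning function. This is a hypothetical illustration: `planSafeOutputs` is an invented name, the `labels` key for add_labels is an assumed argument shape, and returning no calls for other events is an assumption because the quoted snippet only covers `pull_request`.

```typescript
// Hypothetical sketch of the decision logic described in the prompt bullets.
interface SmokeResult {
  result: "PASS" | "FAIL";
  api_status: string;
  gh_check: string;
  file_status: string;
  event: string;
  pr_number: number;
}

interface PlannedCall {
  tool: "add_comment" | "add_labels";
  args: string; // single inline JSON object
}

function planSafeOutputs(r: SmokeResult): PlannedCall[] {
  // Assumption: other events are handled elsewhere and are out of scope here.
  if (r.event !== "pull_request") {
    return [];
  }
  const body = [
    `API status: ${r.api_status}`,
    `gh check: ${r.gh_check}`,
    `File status: ${r.file_status}`,
    `Overall result: ${r.result}`,
  ].join("\n");
  const calls: PlannedCall[] = [
    {
      tool: "add_comment",
      args: JSON.stringify({ item_number: r.pr_number, body }),
    },
  ];
  // The label is only applied when the overall result is PASS.
  if (r.result === "PASS") {
    calls.push({
      tool: "add_labels",
      args: JSON.stringify({ labels: ["smoke-claude"] }), // assumed key name
    });
  }
  return calls;
}

const example: SmokeResult = {
  result: "PASS",
  api_status: "PASS",
  gh_check: "PASS",
  file_status: "PASS",
  event: "pull_request",
  pr_number: 5328,
};
console.log(planSafeOutputs(example));
```

Planning both calls up front keeps the happy path to two tool invocations, which matters when each one counts against the turn budget.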
Comment thread scripts/ci/smoke-claude-workflow.test.ts Outdated
@lpcox

lpcox commented Jun 20, 2026

Collaborator Author

@copilot address review feedback

@github-actions

github-actions Bot commented Jun 20, 2026

Contributor

✅ Copilot review passed with no inline comments.

@lpcox Add the ready-for-aw label to this PR to trigger agentic CI smoke tests.

Copilot AI commented Jun 20, 2026

Contributor

@copilot address review feedback

Addressed in 44dcf79. I replaced the invalid inline JSON example in smoke-claude.md with a copy-pasteable JSON object and renamed the stale Jest test description so it matches max-turns: 8.

@github-actions

github-actions Bot commented Jun 20, 2026

Contributor

Smoke Copilot BYOK AOAI (Entra) completed. Copilot AOAI BYOK (Entra) mode operational. 🔓

@github-actions

Contributor

🚀 Security Guard has started processing this pull request

@github-actions

github-actions Bot commented Jun 20, 2026

Contributor

Smoke Copilot BYOK AOAI (api-key) completed. Copilot AOAI BYOK (api-key) mode operational. 🔓

@github-actions

github-actions Bot commented Jun 20, 2026

Contributor

🔌 Smoke Services — All services reachable! ✅

@github-actions

github-actions Bot commented Jun 20, 2026

Contributor

Smoke Claude passed

@github-actions

github-actions Bot commented Jun 20, 2026

Contributor

📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤

@github-actions

github-actions Bot commented Jun 20, 2026

Contributor

Smoke Gemini completed. All facets verified. 💎

Smoke test completed with FAIL status. Connectivity issues and missing MCP tools reported.

@github-actions

github-actions Bot commented Jun 20, 2026

Contributor

📡 Smoke OTel Tracing completed. All tracing scenarios validated. ✅

@github-actions

github-actions Bot commented Jun 20, 2026

Contributor

Chroot tests passed! Smoke Chroot - All security and functionality tests succeeded.

@github-actions

github-actions Bot commented Jun 20, 2026

Contributor

Smoke Copilot BYOK completed. Copilot BYOK mode operational. 🔓

@github-actions

github-actions Bot commented Jun 20, 2026

Contributor

✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟

@github-actions

github-actions Bot commented Jun 20, 2026

Contributor

🔑 Smoke Copilot PAT: PAT auth validated. All systems operational. ✅

@github-actions

github-actions Bot commented Jun 20, 2026

Contributor

Build Test Suite completed successfully!

@github-actions

Contributor

🤖 Smoke Test Results

PR: fix(smoke-claude): raise turn budget to 8 and fix add_comment usage — @lpcox

| Test | Result |
|---|---|
| GitHub MCP | ✅ Connected (PR listed) |
| HTTP Connectivity | ⚠️ Pre-step data unavailable |
| File Write/Read | ⚠️ Pre-step data unavailable |

Overall: FAIL — pre-step template variables were not substituted; tests 2 & 3 unverifiable.

📰 BREAKING: Report filed by Smoke Copilot

@github-actions

Contributor

@lpcox

  • Merged PR list validation: ✅
  • GitHub.com Connectivity: ✅
  • File Write/Read Test: ✅
  • BYOK Inference Test: ✅

Running in direct BYOK mode (COPILOT_PROVIDER_API_KEY + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw)

Overall: PASS

🔑 BYOK (AOAI api-key) report filed by Smoke Copilot BYOK AOAI (api-key)

@github-actions

Contributor

Smoke Test: Claude Engine Validation

| Check | Result |
|---|---|
| API status | ✅ PASS |
| gh check | ✅ PASS |
| File status | ✅ PASS |

Overall result: PASS

Generated by Smoke Claude for issue #5328 · 60.9 AIC · ⊞ 3.1K

@github-actions

Contributor

Smoke Test: Copilot BYOK (Direct Mode) ✅ PASS

Tests:

  • ✅ GitHub MCP connectivity (2 merged PRs fetched successfully)
  • ✅ GitHub.com reachability (HTTP 200)
  • ✅ File write/read verification
  • ✅ BYOK inference path active (direct mode via api-proxy → api.githubcopilot.com)

Status: Running in direct BYOK mode (COPILOT_PROVIDER_API_KEY) with api-proxy sidecar authentication.

cc @lpcox

🔑 BYOK report filed by Smoke Copilot BYOK

@github-actions

Contributor

@lpcox
✅ GitHub MCP testing
✅ GitHub.com connectivity
✅ File write/read test
✅ BYOK inference test
Running in direct BYOK mode (AWF_AUTH_TYPE=github-oidc + AWF_AUTH_AZURE_* + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw) authenticated via Microsoft Entra
PASS

🪪 BYOK (AOAI Entra) report filed by Smoke Copilot BYOK AOAI (Entra)

@github-actions

Contributor

Merged PRs reviewed: [WIP] Refactor extract functions in token-parsers file, [WIP] Refactor to extract rule-setup phases into named bash functions

✅ GitHub page title contains "GitHub"
✅ Smoke test file created and verified
✅ Discussion lookup returned #5327
npm ci && npm run build

Overall status: PASS

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • registry.npmjs.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex

@github-actions

Contributor

Chroot Version Comparison Results

| Runtime | Host Version | Chroot Version | Match? |
|---|---|---|---|
| Python | Python 3.12.13 | Python 3.12.3 | ❌ |
| Node.js | v24.16.0 | v22.22.3 | ❌ |
| Go | go1.22.12 | go1.22.12 | ✅ |

Overall: ❌ Not all tests passed — Python and Node.js versions differ between host and chroot environments.

Tested by Smoke Chroot

@github-actions

Contributor

Smoke Test: API Proxy OpenTelemetry Tracing

| Scenario | Result | Notes |
|---|---|---|
| 1. Module Loading | ✅ | otel.js loads cleanly; exports: startRequestSpan, setTokenAttributes, setBudgetAttributes, endSpan, endSpanError, shutdown, isEnabled + internal test helpers |
| 2. Test Suite | ✅ | 59 tests passed, 0 failed across otel.test.js + otel-fanout.test.js |
| 3. Env Var Forwarding | ✅ | api-proxy-service-config.ts forwards GH_AW_OTLP_ENDPOINTS, OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_HEADERS, GITHUB_AW_OTEL_TRACE_ID, GITHUB_AW_OTEL_PARENT_SPAN_ID |
| 4. Token Tracker Integration | ✅ | onUsage callback exists in token-tracker-http.js; invoked as onUsage(normalized, model) after usage extraction |
| 5. OTEL Diagnostics | ✅ | No OTLP endpoint configured in this run → graceful fallback to FileSpanExporter writing otel.jsonl; no errors |

All 5 scenarios pass. OTEL integration is functioning correctly.

📡 OTel tracing validated by Smoke OTel Tracing

@github-actions

Contributor

Smoke Test — Copilot PAT Auth

| Test | Result |
|---|---|
| GitHub MCP connectivity | |
| GitHub.com HTTP status | ✅ 200 |
| File write/read | ❌ template vars not substituted |

PR: "fix(smoke-claude): raise turn budget to 8 and fix add_comment usage"
Author: @lpcox | Auth mode: PAT (COPILOT_GITHUB_TOKEN)
Overall: FAIL — file test data unavailable

🔑 PAT report filed by Smoke Copilot PAT

@github-actions

Contributor

Smoke Test Results — FAIL ❌

| Check | Result |
|---|---|
| Redis PING | ❌ timeout — no PONG |
| PostgreSQL pg_isready | no response |
| PostgreSQL SELECT 1 | ❌ not reached |

host.docker.internal resolves to 172.17.0.1 but ports 6379 and 5432 are unreachable. Service containers may not be running in this workflow.

🔌 Service connectivity validated by Smoke Services

@github-actions

Contributor

🏗️ Build Test Suite Results

| Ecosystem | Project | Build/Install | Tests | Status |
|---|---|---|---|---|
| Bun | elysia | | 1/1 passed | ✅ PASS |
| Bun | hono | | 1/1 passed | ✅ PASS |
| C++ | fmt | | N/A | ✅ PASS |
| C++ | json | | N/A | ✅ PASS |
| Deno | oak | N/A | 1/1 passed | ✅ PASS |
| Deno | std | N/A | 1/1 passed | ✅ PASS |
| .NET | hello-world | | N/A | ✅ PASS |
| .NET | json-parse | | N/A | ✅ PASS |
| Go | color | | 1/1 passed | ✅ PASS |
| Go | env | | 1/1 passed | ✅ PASS |
| Go | uuid | | 1/1 passed | ✅ PASS |
| Java | gson | | 1/1 passed | ✅ PASS |
| Java | caffeine | | 1/1 passed | ✅ PASS |
| Node.js | clsx | | All passed | ✅ PASS |
| Node.js | execa | | All passed | ✅ PASS |
| Node.js | p-limit | | All passed | ✅ PASS |
| Rust | fd | | 1/1 passed | ✅ PASS |
| Rust | zoxide | | 1/1 passed | ✅ PASS |

Overall: 8/8 ecosystems passed — ✅ PASS

Generated by Build Test Suite for issue #5328 · 47.9 AIC · ⊞ 7.7K

@github-actions

Contributor

Smoke Test: Gemini Engine Validation

  • GitHub MCP Testing: ❌ (Tools not available in context)
  • GitHub.com Connectivity: ❌ (SSL connect error 35)
  • File Writing Testing: ✅
  • Bash Tool Testing: ✅

Overall status: FAIL

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • localhost

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "localhost"

See Network Configuration for more information.

💎 Faceted by Smoke Gemini

@lpcox lpcox merged commit 10d3a2e into main Jun 20, 2026
91 of 94 checks passed
@lpcox lpcox deleted the fix/smoke-claude-turn-budget branch June 20, 2026 16:52

3 participants