[awf] agent: contribution-check exhausts token cap retrying unreachable cli-proxy (port 18443) #4374

Description

@lpcox

Problem

The contribution-check workflow has a 100% failure rate when Docker is unavailable. The in-sandbox awmg-cli-proxy at localhost:18443 is unreachable, so the agent retries its get_diff/PR-metadata calls 29 times. This inflates the run from a 47-turn baseline to 79 turns, 2.53M tokens, and 16 min, and it hits the 25M effective-token cap. AWF logs 54 firewall-blocked requests to host:18443 (unknown domain).

Context

Source: github/gh-aw#37081. The AgentRx daily optimizer identified run 26998856086 as the sole failure in a 15-run cohort. Root cause confirmed: Docker unavailable → cli-proxy down → unbounded retry loop.

Root Cause

The agent has no fail-fast path for a refused or blocked connection to the cli-proxy endpoint. The AWF firewall blocks port 18443 traffic (correct behaviour), but the agent interprets each blocked request as a transient error and retries, exhausting the token cap before emitting report_incomplete.
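The missing fail-fast behaviour could look roughly like the sketch below. It is not the agent's actual retry code: `call_with_fail_fast`, `ProxyUnavailable`, the attempt limit, and the set of error codes treated as permanent are all hypothetical choices made for illustration. The idea is that a refused connection ends the loop on the first attempt, while genuinely transient errors (for example a timeout) get a small, bounded number of retries.

```python
import errno
import time

# Error codes treated as permanent in this sketch (an assumption, not AWF policy).
NON_RETRYABLE_ERRNOS = {errno.ECONNREFUSED, errno.EHOSTUNREACH, errno.ENETUNREACH}


class ProxyUnavailable(Exception):
    """Hypothetical signal that the caller should stop and emit report_incomplete."""


def call_with_fail_fast(fn, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(); give up at once on permanent errors, retry others a bounded number of times."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except OSError as exc:
            if exc.errno in NON_RETRYABLE_ERRNOS:
                # Refused/unreachable: retrying cannot help, so fail immediately.
                raise ProxyUnavailable(f"endpoint unreachable: {exc}") from exc
            last_error = exc
            if attempt < max_attempts:
                # Exponential backoff between the bounded retries.
                sleep(base_delay * 2 ** (attempt - 1))
    raise ProxyUnavailable(f"gave up after {max_attempts} attempts") from last_error
```

With a wrapper of this shape, a caller would catch `ProxyUnavailable` once and report the run as incomplete instead of spending further turns on the same endpoint.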

Proposed Solution

  1. In containers/agent/setup-iptables.sh, add an explicit REJECT --reject-with tcp-reset rule for localhost:18443 traffic so the agent gets an immediate non-retryable error rather than a timeout.
  2. Coordinate with gh-aw to pre-fetch PR diffs in the deterministic pre-step (runner-side with GH_TOKEN), removing the in-sandbox proxy dependency entirely.
  3. Document cli-proxy failure behaviour in docs/environment.md.
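As a rough illustration of step 2, a runner-side pre-step could fetch the PR diff over the GitHub REST API before the sandbox starts and hand it to the agent as a file. This is only a sketch under assumptions: the function names, the output path, and the way owner/repo/PR number are passed in are hypothetical, and the real gh-aw pre-step may be structured differently. It relies on the `application/vnd.github.diff` media type of the pull request endpoint and on `GH_TOKEN` being set on the runner.

```python
import os
import urllib.request


def build_diff_request(owner, repo, pr_number, token):
    """Build a GET request that asks the pull request endpoint for a raw diff."""
    url = f"https://api.github.com/repos/{owner}/{repo}/pulls/{pr_number}"
    return urllib.request.Request(
        url,
        headers={
            "Accept": "application/vnd.github.diff",  # diff instead of JSON
            "Authorization": f"Bearer {token}",
        },
    )


def prefetch_diff(owner, repo, pr_number, out_path, token=None, timeout=30):
    """Download the diff on the runner and write it where the agent can read it."""
    token = token or os.environ["GH_TOKEN"]  # runner-side token, per step 2
    request = build_diff_request(owner, repo, pr_number, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        data = response.read()
    with open(out_path, "wb") as handle:
        handle.write(data)
    return len(data)  # number of bytes written
```

Because the download happens outside the sandbox, the agent would read the diff from disk and would not need the in-sandbox proxy for this data.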

Generated by Firewall Issue Dispatcher · sonnet46 2.3M ·