Skip to content

perf(security-guard): prioritize security-relevant files in PR diff#5329

Merged
lpcox merged 2 commits into
mainfrom
fix/security-guard-prompt-size
Jun 20, 2026
Merged

perf(security-guard): prioritize security-relevant files in PR diff#5329
lpcox merged 2 commits into
mainfrom
fix/security-guard-prompt-size

Conversation

@lpcox

@lpcox lpcox commented Jun 20, 2026

Copy link
Copy Markdown
Collaborator

Context

While diagnosing the Security Guard failure on PR #5310 (Authentication failed with provider at http://172.30.0.30:10002 (HTTP 403)), I noticed the agent's first request was very large (151.2k tokens, 125.4k cached). The upstream-403 root cause was already addressed separately by #5306 (giving COPILOT_DUMMY_BYOK a valid ghu_… token format). This PR is the complementary prompt-size improvement (option 3 from the diagnosis): shrink the pre-fetched diff so big PRs don't send oversized prompts.

Problem

The Fetch PR changed files step dumped full patches for every changed file in the PR (up to 100 KB), then instructed the agent to re-fetch additional context when truncated. On large refactor PRs this bloats the prompt (cost, latency, and a bigger first request that is more likely to trip the upstream provider).

Change

The pre-fetch now:

  • Includes full patches only for security-relevant files (same security path pattern used by the relevance gate: host-iptables, setup-iptables, squid-config, docker-manager, seccomp-profile, domain-patterns, entrypoint.sh, Dockerfile, containers/), largest first.
  • Excludes test paths ((^|/)tests?/|\.test\.) from that security-relevant inline set, matching the relevance gate behavior.
  • Lists every other changed file by name only (additions/deletions counts), omitting their patches.
  • Updates the prompt + truncation note to clarify that mcp__github__get_pull_request_diff returns the full PR diff; if truncation hides a needed security patch, fetch once and locate that file section.

The agent only runs when ≥1 security-relevant file changed (existing if: gate), so it always has full security patches to review; non-security churn no longer inflates the context.

Measured effect

For a workflow/test-only PR (no security files), the pre-fetched block drops from ~22 KB → a few hundred bytes. For mixed PRs, only security-relevant patches are sent in full.

Notes

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The Security Guard agent's pre-fetched diff dumped full patches for every
changed file, so large refactor PRs sent oversized prompts (cost,
latency, and a larger first-request that can trip the upstream Copilot
provider). Rework the pre-fetch so it includes full patches only for
security-relevant files (matching the same path regex the relevance
gate uses), largest first, and lists every other changed file by name
only. A non-security file's patch no longer bloats the prompt (e.g. a
workflow/test-only PR drops from ~22 KB to a few hundred bytes).

Also stop instructing the agent to re-fetch the entire PR diff on
truncation: security-relevant patches are shown first, so it should only
fetch a still-missing security-relevant file via
get_pull_request_diff. Prompt copy and the truncation note updated to
match.

Recompiled the lock (security-relevant patch prioritization in the
pr-diff step); the COPILOT_DUMMY_BYOK placeholder fix from #5306 and the
gh-aw-mcpg image pin are preserved.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 20, 2026 16:34
@github-actions

github-actions Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

✅ Coverage Check Passed

Overall Coverage

Metric Base PR Delta
Lines 97.62% 97.66% 📈 +0.04%
Statements 97.57% 97.61% 📈 +0.04%
Functions 98.85% 98.85% ➡️ +0.00%
Branches 93.22% 93.25% 📈 +0.03%
📁 Per-file Coverage Changes (1 files)
File Lines (Before → After) Statements (Before → After)
src/workdir-setup.ts 92.7% → 94.5% (+1.82%) 92.7% → 94.5% (+1.82%)

Coverage comparison generated by scripts/ci/compare-coverage.ts

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes the Security Guard workflow’s pre-fetched PR diff to reduce initial prompt size and avoid oversized first requests, by prioritizing full patches for security-relevant files and omitting patches for non-security churn.

Changes:

  • Pre-fetch only includes full patches for security-relevant files (sorted largest-first) and lists all other changed files by name with +/- counts.
  • Updates truncation guidance and prompt instructions to avoid re-fetching the entire PR diff unnecessarily.
  • Regenerates security-guard.lock.yml to reflect the updated workflow source.
Show a summary per file
File Description
.github/workflows/security-guard.md Adjusts PR-file prefetch logic to inline only security-relevant patches and updates prompt/truncation instructions accordingly.
.github/workflows/security-guard.lock.yml Regenerated locked workflow reflecting the updated prefetch logic and prompt text.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 2/2 changed files
  • Comments generated: 4

Comment on lines 69 to +86
DIFF_LIMIT=100000
SECURITY_RE='host-iptables|setup-iptables|squid-config|docker-manager|seccomp-profile|domain-patterns|entrypoint\.sh|Dockerfile|(^|/)containers/'
DIFF_TMP="$(mktemp)"
# Include full patches only for security-relevant files (largest first);
# list every other changed file by name so large non-security refactors
# don't bloat the prompt.
gh api "repos/${GH_REPO}/pulls/${PR_NUMBER}/files" --paginate --slurp \
| jq -r --arg re "$SECURITY_RE" '
([.[][]]) as $files
| ([$files[] | select(.filename | test($re))] | sort_by(-(.additions + .deletions))) as $sec
| ([$files[] | select(.filename | test($re) | not)]) as $other
| ( $sec[]
| "### " + .filename + " (+" + (.additions|tostring) + "/-" + (.deletions|tostring) + ") [security-relevant]\n" + (.patch // "(binary or no textual patch)") + "\n" ),
( if ($other | length) > 0
then "\n### Other changed files (not security-relevant — patches omitted to save context):\n"
+ ([$other[] | "- " + .filename + " (+" + (.additions|tostring) + "/-" + (.deletions|tostring) + ")"] | join("\n")) + "\n"
else empty end )
' > "$DIFF_TMP" || true
Comment thread .github/workflows/security-guard.md Outdated
## ⚡ Fast Path

Read the pre-fetched diff below first. If you see `[DIFF TRUNCATED ...]`, fetch full context once with `mcp__github__get_pull_request_diff` before deciding to noop. Only use the fast path when the full diff contains **no** security-weakening changes: no weakened DROP/REJECT or expanded ACCEPT, no egress/domain allowlist expansion, no firewall chain changes, no capability additions, no ACL regressions, no seccomp relaxations, no DNS/wildcard bypass, no input validation weakening, and no secrets. Then call `safeoutputs noop` immediately — do not read additional files or make further tool calls.
Read the pre-fetched diff below first. Security-relevant files are included in full; other changed files are listed by name only. If you see `[DIFF TRUNCATED ...]` and a **security-relevant** patch is missing, fetch just that file's context once with `mcp__github__get_pull_request_diff` before deciding to noop. Only use the fast path when the security-relevant changes contain **no** security-weakening changes: no weakened DROP/REJECT or expanded ACCEPT, no egress/domain allowlist expansion, no firewall chain changes, no capability additions, no ACL regressions, no seccomp relaxations, no DNS/wildcard bypass, no input validation weakening, and no secrets. Then call `safeoutputs noop` immediately — do not read additional files or make further tool calls.
Comment thread .github/workflows/security-guard.md Outdated
1. **Review the pre-fetched diff below** (security-relevant files in full; other files listed by name)
2. **Batch all independent reads** in a single tool-use block rather than making sequential calls
3. **Use ONLY the pre-fetched diff below.** Do NOT call `gh pr diff`, `gh pr view`, `gh api`, `git diff`, `git log`, or `git show`. Do NOT read files from the checkout. If `[DIFF TRUNCATED ...]` appears, call `mcp__github__get_pull_request_diff` once — then stop making tool calls and analyze inline.
3. **Use ONLY the pre-fetched diff below.** Do NOT call `gh pr diff`, `gh pr view`, `gh api`, `git diff`, `git log`, or `git show`. Do NOT read files from the checkout. If `[DIFF TRUNCATED ...]` appears and a security-relevant patch is missing, call `mcp__github__get_pull_request_diff` once — then stop making tool calls and analyze inline.
Comment thread .github/workflows/security-guard.md Outdated

**IMPORTANT: Be concise.** Report each security finding in ≤ 150 words. Maximum 5 findings total.
If `[DIFF TRUNCATED ...]` is present, fetch full context once with `mcp__github__get_pull_request_diff` before deciding to noop.
If `[DIFF TRUNCATED ...]` is present and a security-relevant patch is missing, fetch just that file's context once with `mcp__github__get_pull_request_diff` before deciding to noop.
@github-actions

github-actions Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

✅ Copilot review passed with no inline comments.

@lpcox Add the ready-for-aw label to this PR to trigger agentic CI smoke tests.

@lpcox

lpcox commented Jun 20, 2026

Copy link
Copy Markdown
Collaborator Author

@copilot address review feedback

Copilot AI commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

@copilot address review feedback

Addressed in b7b21ac.

  • Excluded test paths (tests?/ and *.test.*) from the security-relevant patch inlining logic so container tests are not inlined as security patches.
  • Updated Fast Path / task instructions to clarify mcp__github__get_pull_request_diff returns the full PR diff and should be fetched once, then the missing security-relevant section located.
  • Mirrored the same prefetch logic and truncation wording update in security-guard.lock.yml.

@github-actions

github-actions Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Smoke Claude passed

@github-actions

github-actions Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

🔑 Smoke Copilot PAT PAT auth validated. All systems operational. ✅

@github-actions

github-actions Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Smoke Gemini completed. All facets verified. 💎

Smoke test completed with partial success. GitHub MCP tools missing. Connectivity, file writing, and bash testing PASSED.

@github-actions

github-actions Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Smoke Copilot BYOK completed. Copilot BYOK mode operational. 🔓

@github-actions

github-actions Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Smoke Copilot BYOK AOAI (api-key) completed. Copilot AOAI BYOK (api-key) mode operational. 🔓

@github-actions

github-actions Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

📡 Smoke OTel Tracing completed. All tracing scenarios validated. ✅

@github-actions

github-actions Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟

@github-actions

github-actions Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

🔌 Smoke Services — All services reachable! ✅

@github-actions

github-actions Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Chroot tests passed! Smoke Chroot - All security and functionality tests succeeded.

@github-actions

github-actions Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Smoke Copilot BYOK AOAI (Entra) completed. Copilot AOAI BYOK (Entra) mode operational. 🔓

@github-actions

github-actions Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Build Test Suite completed successfully!

@github-actions

github-actions Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤

@github-actions

Copy link
Copy Markdown
Contributor

Smoke Test: Claude Engine Validation

Check Status
API ✅ PASS
gh ✅ PASS
File ✅ PASS

Overall result: PASS

Generated by Smoke Claude for issue #5329 · 60.9 AIC · ⊞ 3.1K ·

@github-actions

Copy link
Copy Markdown
Contributor

🔐 Smoke Test: Copilot PAT Auth — PASS

Test Result
GitHub MCP connectivity ✅ PR #5328 fetched
GitHub.com HTTP ✅ 200
File write/read /tmp/gh-aw/agent/smoke-test-copilot-pat-27877921175.txt

Overall: PASS@lpcox
Auth mode: PAT (COPILOT_GITHUB_TOKEN)

🔑 PAT report filed by Smoke Copilot PAT

@github-actions

Copy link
Copy Markdown
Contributor

🔬 Smoke Test Results

Test Status
GitHub MCP connectivity ✅ PASS — listed PR #5328 successfully
GitHub.com HTTP ❌ FAIL — pre-step output not substituted
File write/read ❌ FAIL — pre-step output not substituted
Pre-fetched PR data ❌ FAIL — pre-step output not substituted

PR: perf(security-guard): prioritize security-relevant files in PR diff
Author: @lpcox

Overall: FAILsteps.smoke-data.outputs template variables were not substituted (workflow pre-step outputs unavailable).

📰 BREAKING: Report filed by Smoke Copilot

@github-actions

Copy link
Copy Markdown
Contributor

Copilot BYOK Smoke Test Results

✅ GitHub MCP connectivity
✅ GitHub.com HTTP 200
✅ File write/read test
✅ Direct BYOK inference (agent → api-proxy sidecar → api.githubcopilot.com)

Overall: PASS — Running in direct BYOK mode (COPILOT_PROVIDER_API_KEY) via api-proxy.

🔑 BYOK report filed by Smoke Copilot BYOK

@github-actions

Copy link
Copy Markdown
Contributor
  • fix(smoke-claude): raise turn budget to 8 and fix add_comment usage — ✅
  • [WIP] Refactor extract functions in token-parsers file — ✅
  • GitHub title check — ✅
  • Temp file write/read — ✅
  • Build (npm ci && npm run build) — ✅
    Overall: PASS

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • registry.npmjs.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex

@github-actions

Copy link
Copy Markdown
Contributor

Smoke Test Results: Gemini Engine

  1. GitHub MCP Testing: ❌ (MCP tools not available)
  2. GitHub.com Connectivity: ✅ (HTTP 200 via proxy)
  3. File Writing Testing: ✅
  4. Bash Tool Testing: ✅

Overall Status: FAIL

Note: MCP tools were missing from the agent environment. Connectivity was verified using an explicit proxy (172.30.0.10:3128).

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • localhost

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "localhost"

See Network Configuration for more information.

💎 Faceted by Smoke Gemini

@github-actions

Copy link
Copy Markdown
Contributor

Chroot Version Comparison Results

Runtime Host Version Chroot Version Match?
Python Python 3.12.13 Python 3.12.3 ❌ NO
Node.js v24.16.0 v22.22.3 ❌ NO
Go go1.22.12 go1.22.12 ✅ YES

Overall: ❌ FAILED — Python and Node.js versions differ between host and chroot environments.

Tested by Smoke Chroot

@github-actions

Copy link
Copy Markdown
Contributor

Smoke Test: API Proxy OpenTelemetry Tracing

Scenario Result Notes
1. Module Loading otel.js loads; exports startRequestSpan, setTokenAttributes, setBudgetAttributes, endSpan, endSpanError, shutdown, isEnabled
2. Test Suite 39/39 passed (otel.test.js)
3. Env Var Forwarding api-proxy-service-config.ts forwards GH_AW_OTLP_ENDPOINTS, OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_HEADERS, GITHUB_AW_OTEL_TRACE_ID, GITHUB_AW_OTEL_PARENT_SPAN_ID, OTEL_SERVICE_NAME
4. Token Tracker Integration onUsage callback present in token-tracker-http.js; onSpanEnd hook also wired
5. OTEL Diagnostics 1 span exported to /tmp/gh-aw/otel.jsonl (service.name=gh-aw.smoke-otel-tracing, run 27877921158)

All scenarios pass. ✅

📡 OTel tracing validated by Smoke OTel Tracing

@github-actions

Copy link
Copy Markdown
Contributor

🏗️ Build Test Suite Results

Ecosystem Project Build/Install Tests Status
Bun elysia 1/1 passed ✅ PASS
Bun hono 1/1 passed ✅ PASS
C++ fmt N/A ✅ PASS
C++ json N/A ✅ PASS
Deno oak N/A 1/1 passed ✅ PASS
Deno std N/A 1/1 passed ✅ PASS
.NET hello-world N/A ✅ PASS
.NET json-parse N/A ✅ PASS
Go color 1/1 passed ✅ PASS
Go env 1/1 passed ✅ PASS
Go uuid 1/1 passed ✅ PASS
Java gson 1/1 passed ✅ PASS
Java caffeine 1/1 passed ✅ PASS
Node.js clsx All passed ✅ PASS
Node.js execa All passed ✅ PASS
Node.js p-limit All passed ✅ PASS
Rust fd 1/1 passed ✅ PASS
Rust zoxide 1/1 passed ✅ PASS

Overall: 8/8 ecosystems passed — ✅ PASS

Generated by Build Test Suite for issue #5329 · 38.1 AIC · ⊞ 7.7K ·

@github-actions

Copy link
Copy Markdown
Contributor

@lpcox

Smoke Test Results:

  • GitHub MCP PR fetch: ✅
  • GitHub HTTP connectivity: ✅
  • File I/O test: ✅
  • Running in direct BYOK mode (COPILOT_PROVIDER_API_KEY + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw): ✅

Overall: PASS

🔑 BYOK (AOAI api-key) report filed by Smoke Copilot BYOK AOAI (api-key)

@github-actions

Copy link
Copy Markdown
Contributor

Smoke Test Results — FAIL ❌

Check Result
Redis PING ❌ No response (host.docker.internal:6379 unreachable)
PostgreSQL pg_isready no response
PostgreSQL SELECT 1 ❌ No response

host.docker.internal resolves to 172.17.0.1 but service ports 6379 and 5432 are unreachable. Services may not be running or are not exposed to this network.

🔌 Service connectivity validated by Smoke Services

@github-actions

Copy link
Copy Markdown
Contributor

🪪 BYOK (AOAI Entra) report filed by Smoke Copilot BYOK AOAI (Entra)

@lpcox lpcox merged commit 3c8a70d into main Jun 20, 2026
86 of 90 checks passed
@lpcox lpcox deleted the fix/security-guard-prompt-size branch June 20, 2026 17:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants