Executive Summary
- 4 runs sampled across 4 distinct workflows: Daily Safe Output Integrator (DSOI), Test Quality Sentinel (TQS), Matt Pocock Skills Reviewer (MPSR), Failure Investigator (FI)
- Median first-request: 21,450 chars | P95: 30,318 chars
- Largest bloat: DSOI at 30,318 chars with 13 near-identical inline sub-agent stubs (~15,000 chars, 49% of prompt)
- Critical hidden bloat: MPSR `prompt.txt` shows 14,861 chars, but the actual first request is 27,826 tokens (char/tok=0.53) because the runtime injects all 11 `.github/agents/` files (~140KB) regardless of workflow scope
Note: No event-logs.jsonl found; sizes are from aw-prompts/prompt.txt. Token counts are from token-usage.jsonl first-entry.
Highest-Leverage Changes
- DSOI: Parameterize 13 type-specific inline sub-agents into a template table; saves ~13,000 chars (43%)
- All Copilot workflows: Filter irrelevant `.github/agents/` injection; `create-safe-output-type.agent.md` (23,730 bytes) + `custom-engine-implementation.agent.md` (27,203 bytes) are irrelevant to PR-review workflows
- TQS: Move binary-penalty scoring algorithm to a `steps:` script; the 10,118-char embedded formula can be pre-computed
- DSOI: Replace Phase 5 static Go field-mapping table with a source-derive instruction; 1,530 chars of static data are already derivable from `safe_outputs_validation_config.go`
- shared/reporting.md: Remove design-philosophy and example prose; 1,200 chars of "Airbnb-inspired" copy across all importing workflows
- MPSR: Compress Steps 4 & 5 to tight bullets; 2,686 chars → ~1,486 chars in a high-AIC workflow (372, 22 turns)
Key Metrics
| Metric | Value |
|---|---|
| Sampled runs | 4 |
| Distinct workflows | 4 |
| Median chars | 21,450 |
| P95 chars | 30,318 |
| Largest request | 30,318 chars (DSOI) |
| Avg char/token ratio | 1.33 |
| Lowest char/tok | 0.53 (Matt Pocock Skills Reviewer) |
Per-Run First-Request Metrics
| Run | Workflow | Chars | Lines | Headings | Fences | Dup% | Input Tokens | Char/Tok | AIC |
|---|---|---|---|---|---|---|---|---|---|
| §27438027894 | Failure Investigator | 12,212 | 207 | 21 | 2 | 0.0% | 5,117 | 2.39 | 399.95 |
| §27439111642 | Matt Pocock Skills Reviewer | 14,861 | 254 | 20 | 10 | 0.0% | 27,826 | 0.53 | 372.03 |
| §27438448655 | Test Quality Sentinel | 28,038 | 512 | 51 | 26 | 0.7% | 27,245 | 1.03 | 221.50 |
| §27437733590 | Daily Safe Output Integrator | 30,318 | 831 | 68 | 50 | 0.6% | 22,245 | 1.36 | 217.48 |
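The summary figures in Key Metrics can be reproduced from the Chars and Input Tokens columns above. The following is a minimal sketch, assuming the P95 value is a nearest-rank percentile (the report does not state which method was used) and that the average ratio is the mean of the per-run ratios; the function names are illustrative only.

```python
import math
import statistics

# (workflow, chars, input_tokens) copied from the per-run table.
RUNS = [
    ("Failure Investigator", 12_212, 5_117),
    ("Matt Pocock Skills Reviewer", 14_861, 27_826),
    ("Test Quality Sentinel", 28_038, 27_245),
    ("Daily Safe Output Integrator", 30_318, 22_245),
]


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile (assumed method; not confirmed by the report)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(runs):
    """Recompute the headline metrics from raw chars and token counts."""
    chars = [c for _, c, _ in runs]
    ratios = {name: c / t for name, c, t in runs}
    lowest = min(ratios, key=ratios.get)
    return {
        "median_chars": statistics.median(chars),
        "p95_chars": nearest_rank_percentile(chars, 95),
        "avg_char_per_token": round(statistics.mean(list(ratios.values())), 2),
        "lowest_ratio": (lowest, round(ratios[lowest], 2)),
    }


if __name__ == "__main__":
    for key, value in summarize(RUNS).items():
        print(f"{key}: {value}")
```

With four samples the median is the midpoint of the two middle runs (21,449.5, reported as 21,450), and the nearest-rank P95 coincides with the largest request.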
Repeated Ambient Context Signals
- shared/reporting.md: 1,934 chars of `## Report Structure Guidelines` injected identically into TQS and DSOI
- 13 DSOI type sub-agents: `# Test Copilot Update Pull Request` etc. repeat the same `## Task / ## Steps / ## Output` scaffold 13× (only the type name differs)
- TQS scoring verbosity: `### Calibration` (1,519) + `### Red Flags to Detect` (1,863) + `### Verdict` (1,127) = 4,509 chars that could be compressed
- `<system>` boilerplate: 5,273–6,828 chars of runtime security/safe-output prose across all 4 runs
- Most repeated fragments: `for sotype in safe_output_types:` (×2), `cat /tmp/gh-aw/agent/diff-numstat.txt` (×2)
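Repeated fragments like the ones listed above can be surfaced with a simple line counter. This is a rough sketch, not the optimizer's actual method: the report does not say how Dup% is computed, so the definition here (characters spent on repeat occurrences of non-trivial lines, divided by prompt length) and the `min_len` threshold are assumptions.

```python
from collections import Counter


def repeated_fragments(prompt: str, min_len: int = 20):
    """Return {line: count} for stripped lines of at least min_len chars seen more than once."""
    lines = [ln.strip() for ln in prompt.splitlines()]
    counts = Counter(ln for ln in lines if len(ln) >= min_len)
    return {ln: n for ln, n in counts.items() if n > 1}


def duplicate_pct(prompt: str, min_len: int = 20) -> float:
    """Percent of prompt characters used by repeat occurrences beyond the first (assumed definition)."""
    if not prompt:
        return 0.0
    repeats = repeated_fragments(prompt, min_len)
    wasted = sum(len(ln) * (n - 1) for ln, n in repeats.items())
    return 100.0 * wasted / len(prompt)


if __name__ == "__main__":
    sample = (
        "for sotype in safe_output_types:\n"
        "short\n"
        "for sotype in safe_output_types:\n"
        "short\n"
    )
    print(repeated_fragments(sample))
    print(f"{duplicate_pct(sample):.1f}% duplicated")
```

Exact-line matching will miss near-identical blocks such as the 13 DSOI stubs, where only the type name differs; catching those would need normalization or fuzzy matching.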
Deterministic Analysis Output
Longest sections per workflow:
- DSOI: `## Phase 5: Add Go Compiler Tests` = 2,745 chars
- TQS: `### Red Flags to Detect` = 1,863 chars; embedded scoring formula block = 10,118 chars
- MPSR: `### Step 4: Review Using Selected Skills` = 1,602 chars
- FI (Claude Code): `## Tone Variant Instructions` = 801 chars; only 5,117 tokens, the most efficient engine by far
Key structural insight: FI uses the Claude Code engine (char/tok=2.39, 5,117 tokens); the three Copilot-engine workflows consume 22,245–27,826 input tokens at first request, roughly 4.3–5.4× as many. Agent injection is the primary driver of MPSR's 0.53 char/tok ratio.
Recommendations by Category
Workflow Markdown
1. DSOI — Parameterize 13 type-specific sub-agents [HIGH] · saves ~13,000 chars
- Replace 13 `# Test Copilot <TypeName>` blocks with one compact markdown table (type | perm-scope | task | tool-key) + existing template
- Safe to apply immediately
2. TQS — Extract scoring algorithm to steps: script [HIGH] · saves ~8,000–10,000 chars
- Move `# Binary penalty: deduct all 10 points...` (10,118 chars) to a pre-agent shell step; emit result to `/tmp/gh-aw/agent/inflation-score.txt`
- Needs functional testing before applying
3. DSOI — Replace static Phase 5 field-mapping table with source-derive instruction [MEDIUM] · saves ~1,530 chars
- 13-entry lookup table is fully derivable from `pkg/workflow/safe_outputs_validation_config.go`; replace with `grep` instruction
- Safe to apply immediately
4. shared/reporting.md — Remove design-philosophy and example prose [MEDIUM] · saves ~1,200 chars per importer
- Delete `### Design Principles (Airbnb-Inspired)` and `### Example Report Structure` blocks
- Safe to apply; benefits every workflow importing `reporting.md`
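To illustrate recommendation 1, the sketch below shows how near-identical stubs can be fully determined by one template plus a table of rows. Only the heading pattern `# Test Copilot <TypeName>`, the `## Task / ## Steps / ## Output` scaffold, the column names, and the type name "Update Pull Request" come from this report; the template wording, the second row, and every `perm_scope`, `task`, and `tool_key` value are hypothetical placeholders.

```python
from string import Template

# One shared scaffold. Section names follow the report; the body text is a placeholder.
STUB = Template(
    "# Test Copilot $type_name\n\n"
    "## Task\n$task\n\n"
    "## Steps\nCall `$tool_key` using the `$perm_scope` permission scope.\n\n"
    "## Output\nReport whether the call succeeded.\n"
)

# Hypothetical rows; a real table would carry all 13 safe-output types.
ROWS = [
    {"type_name": "Update Pull Request", "perm_scope": "pull-requests: write",
     "task": "Update the title of a test pull request.", "tool_key": "update_pull_request"},
    {"type_name": "Example Type", "perm_scope": "issues: write",
     "task": "Exercise the example safe-output type.", "tool_key": "example_type"},
]


def render_stubs(rows):
    """Expand every table row into a full sub-agent stub."""
    return [STUB.substitute(row) for row in rows]


def render_table(rows):
    """Render the compact markdown table that would replace the inline stubs."""
    header = "| type | perm-scope | task | tool-key |\n|---|---|---|---|\n"
    body = "".join(
        f"| {r['type_name']} | {r['perm_scope']} | {r['task']} | {r['tool_key']} |\n"
        for r in rows
    )
    return header + body


if __name__ == "__main__":
    expanded = sum(len(s) for s in render_stubs(ROWS))
    compact = len(render_table(ROWS)) + len(STUB.template)
    print(f"expanded stubs: {expanded} chars; table + one template: {compact} chars")
```

The template cost is paid once, so the saving grows with the number of rows; with only two rows the compact form may not be smaller, which is why the estimate in the report depends on all 13 types sharing the scaffold.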
Skills
5. MPSR — Compress Steps 4 & 5 to action bullets [MEDIUM] · saves ~1,200 chars
- Collapse 1,602-char Step 4 and 1,084-char Step 5 to ≤3 bullets each
- Directly reduces AIC across 22 turns per run
Agents
6. All Copilot workflows — Scope-filter .github/agents/ injection [HIGH]
- Add `applyTo` filtering or an explicit `agents:` allowlist in workflow frontmatter
- `create-safe-output-type.agent.md` (23,730 bytes) + `custom-engine-implementation.agent.md` (27,203 bytes) are irrelevant to PR-review workflows = 50,933 bytes of avoidable injection
- Needs runtime support verification
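The effect of an allowlist can be estimated before any runtime change. The sketch below is an assumption-laden illustration: the `agents:` allowlist is the proposal above and not a confirmed runtime feature, glob-style matching is one possible semantics, and `example-pr-review.agent.md` with its size is a hypothetical stand-in for an agent a PR-review workflow would keep. The two large file sizes are taken from this report.

```python
from fnmatch import fnmatchcase

# Sizes in bytes. The first two entries come from the report; the third is hypothetical.
AGENT_SIZES = {
    "create-safe-output-type.agent.md": 23_730,
    "custom-engine-implementation.agent.md": 27_203,
    "example-pr-review.agent.md": 4_000,
}


def select_agents(sizes, allow_patterns):
    """Keep only agent files whose name matches at least one allowlist glob."""
    return {
        name: size
        for name, size in sizes.items()
        if any(fnmatchcase(name, pattern) for pattern in allow_patterns)
    }


def injection_savings(sizes, allow_patterns):
    """Bytes that would no longer be injected once the allowlist is applied."""
    kept = select_agents(sizes, allow_patterns)
    return sum(sizes.values()) - sum(kept.values())


if __name__ == "__main__":
    allow = ["example-pr-review.agent.md"]
    print(sorted(select_agents(AGENT_SIZES, allow)))
    print(f"avoidable injection: {injection_savings(AGENT_SIZES, allow):,} bytes")
```

Running the same calculation against the real contents of `.github/agents/` for each workflow would show which files are worth excluding first.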
Generated by 🌫️ Daily Ambient Context Optimizer · 1.9K AIC · ⌖ 26.5 AIC · ⊞ 21.8K · ◷