Agentic Workflow Portfolio Yield Report
Analysis Date: 2026-05-21
Portfolio Size: 233 workflows
Overall Portfolio Yield: -86.52 (negative)
Evidence Quality: Low (0% telemetry validation)
Executive Summary
The agentic workflow portfolio is in critical condition with systemic architectural issues:
- Negative portfolio yield (-86.52) indicates workflows consume more resources than they deliver value
- Zero telemetry coverage (0.0%) prevents evidence-based decision making despite 88% declared observability
- Extreme agentic fraction (97%) inverts best practices - deterministic tasks run as agentic workflows
- Massive overlap drag (346.15) from 186 merge candidates representing 80% of the portfolio
- Only 3 workflows (1.3%) meet criteria for retention without modification
The portfolio exhibits fragmentation, under-instrumentation, and lack of governance. Immediate consolidation and instrumentation are required to restore portfolio health.
Portfolio Health
| Metric | Value | Status | Analysis |
| --- | --- | --- | --- |
| Portfolio Yield | -86.52 | 🔴 Critical | Negative yield indicates systematic value destruction |
| Workflow Count | 233 | ⚠️ Warning | High count with massive overlap suggests proliferation over reuse |
| Telemetry Coverage | 0.0% | 🔴 Critical | Zero validated observability prevents evidence-based governance |
| Agentic Fraction | 97.07% | 🔴 Critical | Inverted architecture - deterministic work running as agentic |
| Overlap Drag | 346.15 | 🔴 Critical | 186 merge candidates (80% of portfolio) |
| Evidence Quality | Low | 🔴 Critical | No telemetry validation of workflow outcomes |
| Portfolio Risk | 0.66 | ⚠️ Warning | High risk across portfolio |
| Maintenance Drag | 0.89 | 🔴 Critical | Workflows difficult to maintain and validate |
| Trust Concentration | 0.24 | 🔴 Critical | Low confidence in workflow reliability |
| Governance Drag | 0.91 | 🔴 Critical | Broad scope, missing telemetry, high agentic fractions |
| Fragmentation | 1.0 | 🔴 Critical | Maximum fragmentation - workflows operate in isolation |
| Reuse | 0.89 | ⚠️ Warning | High overlap suggests copy-paste instead of composition |
Interpretation
Portfolio-level crisis: Every health metric is in warning or critical status. The portfolio exhibits classic symptoms of unconstrained growth without governance:
- Proliferation over composition - 80% merge candidates indicate copy-paste workflow creation
- Agentic-by-default - 97% agentic fraction shows deterministic work running as expensive AI tasks
- Evidence vacuum - 0% telemetry prevents empirical optimization
- Isolation architecture - Workflows don't compose or share context (episode metrics: 0)
Workflow Portfolio
Keep (3 workflows - 1.3%)
These workflows show positive yield or a clear, low-complexity value proposition; yield and risk are not available (N/A) for two of the three:
| Workflow | Yield | Risk | Agentic % | Reason |
| --- | --- | --- | --- | --- |
| aw-portfolio-yield.md | 0.0835 | 0.36 | 36.55% | Highest yield - Provides critical portfolio visibility with balanced agentic/deterministic mix |
| example-permissions-warning.md | N/A | N/A | N/A | Documentation workflow with clear value and low complexity |
| terminal-stylist.md | N/A | N/A | N/A | Focused, single-purpose workflow with measurable output |
Retire (7 workflows - 3.0%)
Low-yield workflows with maximum risk and unclear value propositions:
| Workflow | Yield | Risk | Reason |
| --- | --- | --- | --- |
| ace-editor.md | 0.0017 | 1.0 | Lowest yield, maximum risk, unclear value |
| ab-testing-advisor.md | 0.0021 | 1.0 | Second-lowest yield, no instrumentation |
| test-workflow.md | 0.0034 | 1.0 | Test artifact with no production value |
| copilot-cli-deep-research.md | 0.0035 | 1.0 | Expensive research with no quality metrics |
| daily-doc-updater.md | 0.0035 | 1.0 | Overlaps with other doc workflows |
| daily-news.md | 0.0035 | 1.0 | Unclear connection to repository value |
| daily-semgrep-scan.md | 0.0035 | 1.0 | Security scanning should be deterministic CI |
Revise (10 workflows - 4.3%)
Workflows with potential value but requiring architectural changes:
Revise Candidates
| Workflow | Issue | Recommendation |
| --- | --- | --- |
| constraint-solving-potd.md | High potential but unbounded scope | Add deterministic guardrails, define success criteria |
| daily-astrostylelite-markdown-spellcheck.md | Agentic spell-checking | Convert to deterministic spell-check tools |
| daily-model-inventory.md | Agentic inventory management | Should be deterministic with minimal interpretation |
| daily-spdd-spec-planner.md | Unclear success criteria | Define measurable planning outcomes |
| dependabot-repair.md | High-value but high-risk | Add validation gates and rollback mechanisms |
| deployment-incident-monitor.md | Agentic monitoring | Use deterministic alerting thresholds |
| dictation-prompt.md | Niche use case | Clarify value proposition or retire |
| smoke-service-ports.md | Agentic infrastructure testing | Convert to deterministic port checks |
| video-analyzer.md | Expensive with no cost controls | Add cost limits and quality metrics |
| weekly-editors-health-check.md | Agentic health checks | Use deterministic metrics dashboards |
Merge (186 candidates - 79.8%)
The precompute identified 186 merge candidates representing 80% of the portfolio. Top consolidation opportunities:
Critical Overlap Clusters
| Overlap % | Workflows | Consolidation Path |
| --- | --- | --- |
| 97% | daily-grafana-otel-instrumentation-advisor.md, daily-otel-instrumentation-advisor.md | Immediate merge - Near-duplicate workflows |
| 78% | smoke-agent-all-merged.md, smoke-agent-all-none.md, smoke-agent-public-none.md | Consolidate to parametrized smoke test |
| 80% | smoke-crush.md, smoke-gemini.md, smoke-opencode.md, smoke-pi.md | Single smoke test with engine parameter |
Consolidation Themes
- OTEL workflows - 2 workflows → 1 consolidated advisor
- Smoke tests - 7+ engine tests → 1 parametrized framework
- Documentation - 5+ doc workflows → 1 with operational modes
- PR analysis - 3+ analyzers → 1 multi-aspect analyzer
- Daily reports - 10+ reports → configurable reporting framework
Instrument (27 workflows - 11.6%)
Workflows lacking observability despite declared instrumentation:
Instrumentation Gaps
| Workflow | Missing Telemetry |
| --- | --- |
| ab-testing-advisor.md | Success/failure tracking for recommendations |
| copilot-cli-deep-research.md | Research quality and actionability metrics |
| daily-agentrx-trace-optimizer.md | Optimization effectiveness metrics |
| daily-compiler-quality.md | Quality improvement metrics |
| daily-doc-healer.md | Documentation improvement metrics |
| daily-compiler-threat-spec-optimizer.md | Threat detection accuracy |
| daily-safe-output-integrator.md | Integration success rates |
| daily-team-status.md | Status report accuracy |
| delight.md | User satisfaction metrics |
| developer-docs-consolidator.md | Consolidation effectiveness |
| discussion-task-miner.md | Task extraction quality |
| go-fan.md | Code suggestion acceptance |
| go-logger.md | Logger implementation quality |
| instructions-janitor.md | Cleanup effectiveness |
| layout-spec-maintainer.md | Spec compliance metrics |
| pr-description-caveman.md | Description quality improvement |
| sergo.md | Service generation quality |
| spec-enforcer.md | Enforcement action success rates |
| spec-librarian.md | Library organization metrics |
| step-name-alignment.md | Alignment improvement metrics |
| typist.md | Type correction accuracy |
| ubuntu-image-analyzer.md | Image analysis quality |
| workflow-skill-extractor.md | Skill extraction accuracy |
| (Plus 4 more - see precompute for full list) | |
Overlap Clusters
The portfolio contains only 3 detected overlap clusters (precompute threshold likely filters low-overlap pairs). The detected clusters show extreme overlap:
Cluster 1: OTEL Instrumentation (97% overlap) 🔴
Workflows:
.github/workflows/daily-grafana-otel-instrumentation-advisor.md
.github/workflows/daily-otel-instrumentation-advisor.md
Analysis: Near-duplicate workflows providing identical OTEL instrumentation advice. One appears to be a Grafana-specific variant.
Action: Immediate consolidation required - Merge into single workflow with optional Grafana integration.
Cluster 2: Agent Smoke Tests (78% overlap) ⚠️
Workflows:
.github/workflows/smoke-agent-all-merged.md
.github/workflows/smoke-agent-all-none.md
.github/workflows/smoke-agent-public-none.md
Analysis: Smoke tests varying only by agent configuration (all-merged vs all-none vs public-none).
Action: Consolidate to parametrized smoke test with configuration matrix.
Cluster 3: Engine Smoke Tests (80% overlap) ⚠️
Workflows:
.github/workflows/smoke-crush.md
.github/workflows/smoke-gemini.md
.github/workflows/smoke-opencode.md
.github/workflows/smoke-pi.md
Analysis: Identical smoke test structure with different engine parameters.
Action: Single parametrized smoke test with engine as input variable.
Hidden Overlap
The precompute reports 186 total merge candidates but only 3 clusters. This suggests:
- Threshold filtering - Many pairs have 40-70% overlap but don't reach cluster threshold
- Topic-based overlap - Workflows share approaches without exact duplication
- Copy-paste proliferation - Similar workflows created independently instead of using shared patterns
Recommendation: Lower cluster detection threshold to surface medium-overlap pairs for consolidation review.
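The effect of lowering the threshold can be sketched as follows. This is a minimal illustration, not the precompute's actual clustering logic: it assumes each entry of overlap_pairs can be reduced to a (workflow_a, workflow_b, overlap) tuple, and the pairwise values in the sample are illustrative rather than taken from the precompute. Clusters are built with a simple union-find over every pair at or above the threshold.

```python
from collections import defaultdict


def cluster_by_overlap(pairs, threshold):
    """Group workflows into clusters by linking every pair whose overlap >= threshold."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, overlap in pairs:
        if overlap >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for name in list(parent):
        groups[find(name)].add(name)

    # Only groups with at least two members count as clusters.
    clusters = [sorted(members) for members in groups.values() if len(members) > 1]
    return sorted(clusters, key=lambda c: c[0])


# Illustrative pairs (values are assumptions, not precompute output).
SAMPLE_PAIRS = [
    ("daily-grafana-otel-instrumentation-advisor.md", "daily-otel-instrumentation-advisor.md", 0.97),
    ("smoke-crush.md", "smoke-gemini.md", 0.80),
    ("daily-doc-healer.md", "daily-doc-updater.md", 0.55),
]

if __name__ == "__main__":
    for threshold in (0.75, 0.50):
        print(threshold, len(cluster_by_overlap(SAMPLE_PAIRS, threshold)), "clusters")
```

With these sample values, a 0.75 threshold yields two clusters and a 0.50 threshold surfaces a third, which is the kind of medium-overlap pair the recommendation aims to expose for review.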
Episode-Level Observations
Finding: No episode-level metrics detected in precompute data.
Implications
- Workflows operate in isolation - No coordinated sequences (e.g., analyze → recommend → implement)
- Missing shared state - Workflows cannot hand off context or results
- No composition patterns - Each workflow is a self-contained black box
- Fragmentation by design - Architecture doesn't support workflow chains
Missed Opportunities
Potential episode patterns that could reduce fragmentation:
- Code quality episode: Detect issue → Generate fix → Validate → Deploy
- Documentation episode: Identify gap → Draft content → Review → Publish
- Performance episode: Detect bottleneck → Profile → Optimize → Benchmark
- Security episode: Scan → Triage → Remediate → Verify
Recommendation
Introduce episode coordination primitives:
- Shared state mechanism (cache-memory with versioning)
- Workflow handoff protocol (outputs → inputs with schema)
- Episode orchestrator pattern (parent workflow coordinating children)
- Success criteria across episode (not just individual workflows)
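One way the handoff protocol (outputs → inputs with schema) could look is sketched below. Everything here is hypothetical: the Handoff type, the REQUIRED_INPUTS registry, and the step name generate-fix are illustrative names, not part of gh-aw. The idea is only that a consuming step declares the fields it needs and rejects a handoff that does not satisfy them.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    """Versioned payload passed from one workflow step to the next (hypothetical)."""
    episode_id: str
    producer: str
    schema_version: int
    outputs: dict = field(default_factory=dict)


# Hypothetical registry: consumer step name -> required input fields and types.
REQUIRED_INPUTS = {
    "generate-fix": {"issue_id": str, "file_path": str},
}


def validate_handoff(handoff, consumer):
    """Return a list of schema problems; an empty list means the handoff is accepted."""
    schema = REQUIRED_INPUTS.get(consumer)
    if schema is None:
        return [f"no schema registered for consumer '{consumer}'"]
    problems = []
    for key, expected in schema.items():
        if key not in handoff.outputs:
            problems.append(f"missing field '{key}'")
        elif not isinstance(handoff.outputs[key], expected):
            problems.append(f"field '{key}' should be {expected.__name__}")
    return problems


if __name__ == "__main__":
    h = Handoff("ep-1", "detect-issue", 1, {"issue_id": "123"})
    print(validate_handoff(h, "generate-fix"))
```

Validating at the boundary keeps each workflow independently testable while still letting an orchestrator chain them into an episode.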
Organizational Health Signals
| Signal | Value | Status | Interpretation |
| --- | --- | --- | --- |
| Fragmentation | 1.0 | 🔴 Critical | Maximum fragmentation - zero workflow composition |
| Governance Drag | 0.91 | 🔴 Critical | High management overhead from broad scope, missing telemetry |
| Reuse | 0.89 | ⚠️ Warning | High score paradoxically indicates overlap, not healthy reuse |
| Trust Concentration | 0.24 | 🔴 Critical | Low confidence distribution across portfolio |
Analysis
Fragmentation (1.0): The maximum fragmentation score indicates that workflows don't compose, share context, or coordinate. This is the inverse of a healthy microservices architecture, where composition is key.
Governance Drag (0.91): Near-maximum governance cost from:
- Broad, unfocused workflow scopes
- Missing telemetry preventing empirical optimization
- High agentic fractions requiring human validation
Reuse (0.89): High reuse score is misleading - it reflects overlap (80% merge candidates), not healthy composition. True reuse would be shared libraries/actions, not duplicated workflows.
Trust Concentration (0.24): Low trust indicates outcomes are unpredictable. Without telemetry validation, there's no empirical basis for confidence.
Organizational Patterns
The health signals reveal a proliferation anti-pattern:
- New problem → Create new workflow (no reuse checking)
- Similar problem → Copy existing workflow (overlap instead of abstraction)
- No telemetry → No feedback loop → No optimization
- Agentic-by-default → High cost, low predictability
Root cause: Missing architectural governance and composition primitives.
Deterministic vs Agentic Findings
Precompute (Deterministic) Findings
The deterministic analysis provided objective metrics:
| Finding | Value | Interpretation |
| --- | --- | --- |
| Workflow count | 233 | Large portfolio |
| Average agentic fraction | 97.07% | Inverted architecture |
| Telemetry coverage | 0.0% | No validated observability |
| Merge candidates | 186 (80%) | Massive consolidation opportunity |
| Portfolio yield | -86.52 | Negative value creation |
| Overlap drag | 346.15 | Extreme redundancy |
| Evidence quality | Low | No empirical validation |
Deterministic strength: Objective metrics free from interpretation bias.
Agent (Agentic) Interpretation
The semantic analysis adds context and causality:
- Architectural inversion: Portfolio treats agentic AI as the default tool instead of an exception reserved for complex judgment
- Evidence vacuum: Systematic under-instrumentation prevents empirical governance
- Copy-paste culture: Overlap clusters suggest workflows are created without reuse patterns
- Missing composition: No episode-level coordination or shared abstractions
- Unconstrained proliferation: 233 workflows without a consolidation forcing function
Agentic strength: Pattern recognition and root cause hypothesis.
Complementary Value
- Deterministic: "What" (objective state, quantified problems)
- Agentic: "Why" and "How to fix" (causality, recommendations)
The precompute correctly identified all critical issues. The agentic interpretation adds actionable remediation paths.
Validation
Agreement: Both layers agree on critical status and primary issues.
Divergence: None detected - agentic interpretation aligns with deterministic findings.
Confidence: High - objective metrics support interpretive conclusions.
Highest-Value Actions
Immediate (This Week)
- 🔴 Consolidate OTEL workflows (97% overlap)
  - Merge daily-grafana-otel-instrumentation-advisor.md and daily-otel-instrumentation-advisor.md
  - Expected yield improvement: +1-2 points from reduced drag
- 🔴 Retire 7 low-yield workflows (yield < 0.004)
  - Remove: ace-editor, ab-testing-advisor, test-workflow, copilot-cli-deep-research, daily-doc-updater, daily-news, daily-semgrep-scan
  - Expected yield improvement: +3-5 points from reduced maintenance drag
- 🔴 Instrument top 5 workflows with observability
  - Add telemetry to: daily-compiler-quality, daily-doc-healer, ab-testing-advisor, copilot-cli-deep-research, daily-agentrx-trace-optimizer
  - Expected improvement: Evidence quality → Medium
Short-term (This Month)
- ⚠️ Consolidate smoke tests (78-80% overlap)
  - Create parametrized smoke test framework
  - Merge 7+ engine-specific tests into single configurable workflow
  - Expected yield improvement: +10-15 points from overlap reduction
- ⚠️ Convert deterministic workflows from agentic
  - Workflows: spell-check, port tests, health checks, semgrep scans
  - Convert to bash/action-based deterministic implementations
  - Expected improvement: Agentic fraction → 85% (from 97%)
Strategic (This Quarter)
- 🟡 Establish portfolio governance
  - Yield threshold: Retire workflows < 0.01 without justification
  - Mandatory telemetry: All new workflows must declare observability
  - Consolidation review: Monthly overlap cluster analysis
  - Expected improvement: Governance drag → 0.5 (from 0.91)
- 🟡 Introduce episode coordination
  - Design shared state mechanism
  - Create workflow handoff protocol
  - Identify 3-5 episode candidates for pilot
  - Expected improvement: Fragmentation → 0.7 (from 1.0)
- 🟡 Instrument remaining 22 workflows
  - Complete instrumentation gap closure
  - Achieve 100% observability declaration coverage
  - Validate telemetry with Tempo integration
  - Expected improvement: Telemetry coverage → 100% (from 0%)
Success Metrics
| Metric | Current | Target (3mo) | Target (6mo) |
| --- | --- | --- | --- |
| Portfolio Yield | -86.52 | -50.0 | +10.0 |
| Workflow Count | 233 | 150 | 100 |
| Agentic Fraction | 97% | 85% | 70% |
| Telemetry Coverage | 0% | 50% | 100% |
| Merge Candidates | 186 | 50 | 10 |
| Evidence Quality | Low | Medium | High |
Retirement Candidates
The following 7 workflows are recommended for immediate retirement based on low yield (< 0.004) and maximum risk (1.0):
| Rank | Workflow | Yield | Risk | Rationale |
| --- | --- | --- | --- | --- |
| 1 | ace-editor.md | 0.0017 | 1.0 | Lowest yield in portfolio, unclear value proposition |
| 2 | ab-testing-advisor.md | 0.0021 | 1.0 | No instrumentation, expensive agentic analysis for A/B testing |
| 3 | test-workflow.md | 0.0034 | 1.0 | Test artifact - no production value |
| 4 | copilot-cli-deep-research.md | 0.0035 | 1.0 | Expensive research with no quality/actionability metrics |
| 5 | daily-doc-updater.md | 0.0035 | 1.0 | Overlaps with other documentation workflows |
| 6 | daily-news.md | 0.0035 | 1.0 | Unclear connection to repository value |
| 7 | daily-semgrep-scan.md | 0.0035 | 1.0 | Should be deterministic CI, not agentic workflow |
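The selection rule stated for this list (yield below 0.004 and risk of 1.0) can be expressed as a small filter. The sketch below is illustrative only: the tuple layout and function name are assumptions, and the rows are copied from the tables in this report, with aw-portfolio-yield.md included as a contrasting row that the rule should not select.

```python
YIELD_CUTOFF = 0.004  # from "low yield (< 0.004)"
MAX_RISK = 1.0        # from "maximum risk (1.0)"

# (workflow, yield, risk) rows taken from the report's tables.
WORKFLOWS = [
    ("daily-news.md", 0.0035, 1.0),
    ("ace-editor.md", 0.0017, 1.0),
    ("aw-portfolio-yield.md", 0.0835, 0.36),
    ("daily-semgrep-scan.md", 0.0035, 1.0),
    ("ab-testing-advisor.md", 0.0021, 1.0),
    ("copilot-cli-deep-research.md", 0.0035, 1.0),
    ("test-workflow.md", 0.0034, 1.0),
    ("daily-doc-updater.md", 0.0035, 1.0),
]


def retirement_candidates(rows, yield_cutoff=YIELD_CUTOFF, risk=MAX_RISK):
    """Return rows below the yield cutoff at maximum risk, lowest yield first."""
    picked = [r for r in rows if r[1] < yield_cutoff and r[2] >= risk]
    # Ties on yield are broken alphabetically by workflow name.
    return sorted(picked, key=lambda r: (r[1], r[0]))


if __name__ == "__main__":
    for rank, (name, yld, rsk) in enumerate(retirement_candidates(WORKFLOWS), start=1):
        print(rank, name, yld, rsk)
```

Keeping the cutoffs as named constants makes it straightforward to rerun the same rule with the stricter 0.01 governance threshold proposed elsewhere in the report.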
Retirement Process
- Announce retirement in workflow documentation
- Monitor usage for 1 week (check GitHub Actions runs)
- Archive workflows to archived/ directory (don't delete)
- Document retirement reason in commit message
- Track yield impact after 2 weeks
Expected Impact
- Yield improvement: +3 to +5 points (reduced maintenance drag)
- Portfolio clarity: Remove noise from low-value experiments
- Resource savings: Eliminate 7 workflow slots from scheduled runs
Consolidation Opportunities
Theme-Based Consolidation
| Theme | Current | Target | Workflows → Consolidated |
| --- | --- | --- | --- |
| OTEL Instrumentation | 2 | 1 | daily-grafana-otel-* + daily-otel-* → otel-instrumentation-advisor.md |
| Smoke Tests | 7+ | 1 | All engine-specific → smoke-test-suite.md (parametrized) |
| Documentation | 5+ | 1 | doc-updater, doc-healer, technical-doc-writer → doc-manager.md (modes) |
| PR Analysis | 3+ | 1 | copilot-pr-* workflows → pr-analyzer.md (multi-aspect) |
| Daily Reports | 10+ | 1 | Various daily-*-report → reporting-framework.md (configurable) |
Parametrization Strategy
Convert workflow families to single parametrized workflows:
Before (4 workflows):
# smoke-crush.md
engine: crush
# smoke-gemini.md
engine: gemini
# smoke-opencode.md
engine: opencode
# smoke-pi.md
engine: pi
After (1 workflow):
# smoke-test-suite.md
on:
  workflow_dispatch:
    inputs:
      engine:
        type: choice
        options: [crush, gemini, opencode, pi, all]
  schedule:
    - cron: "0 8 * * *" # Run all engines daily
# Scheduled runs carry no input, so fall back to 'all' (one of the declared options)
engine: ${{ inputs.engine || 'all' }}
Mode-Based Consolidation
Convert similar workflows to multi-mode single workflow:
Before (3 workflows):
daily-doc-updater.md (update existing docs)
daily-doc-healer.md (fix doc errors)
technical-doc-writer.md (write new docs)
After (1 workflow):
# doc-manager.md
on:
  workflow_dispatch:
    inputs:
      mode:
        type: choice
        options: [update, heal, write]
# Workflow adapts behavior based on mode input
Expected Consolidation Impact
| Metric | Before | After | Improvement |
| --- | --- | --- | --- |
| Workflow count | 233 | ~100 | -57% |
| Overlap drag | 346.15 | ~50 | -86% |
| Maintenance surface | 233 files | ~100 files | -57% |
| Portfolio yield | -86.52 | ~-40 | +54% |
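The improvement percentages follow from a relative-change calculation against the magnitude of the starting value, which is what makes the move from a yield of -86.52 to about -40 come out as a positive 54%. A minimal sketch, assuming the approximate "after" figures are taken at face value (the function name is illustrative):

```python
def percent_change(before, after):
    """Relative change from before to after, as a percentage of |before|."""
    return (after - before) / abs(before) * 100


if __name__ == "__main__":
    print(round(percent_change(233, 100)))      # workflow count
    print(round(percent_change(346.15, 50)))    # overlap drag
    print(round(percent_change(-86.52, -40)))   # portfolio yield
```

Dividing by the absolute value matters only for the yield row, where the baseline is negative; for the other rows it is the ordinary percentage change.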
Instrumentation Gaps
Current State
- Declared observability: 88.41% (206 workflows declare telemetry)
- Validated observability: 0.0% (zero workflows have validated telemetry)
- Gap: 88.41 percentage points
Root Causes
- Missing Grafana access: Tempo datasource not accessible for validation
- No telemetry validation: Workflows declare observability: but don't emit traces
- No feedback loop: Without telemetry, workflows can't self-optimize
- No portfolio dashboard: Can't visualize aggregate workflow health
Critical Gaps by Category
| Category | Workflows | Missing Metrics |
| --- | --- | --- |
| Optimization workflows | 5 | Optimization effectiveness, before/after metrics |
| Quality workflows | 4 | Quality improvement scores, defect detection rates |
| Research workflows | 3 | Research quality, actionability, citation accuracy |
| Documentation workflows | 5 | Documentation coverage, readability improvement |
| Analysis workflows | 10 | Analysis accuracy, recommendation acceptance |
Instrumentation Requirements
Every workflow should emit:
- Execution metrics (already captured by gh-aw runtime):
  - Duration, token count, cost
  - Success/failure status
  - Error categories
- Outcome metrics (workflow-specific):
  - Quality: accuracy, precision, recall
  - Value: recommendations accepted, issues fixed
  - Cost: resources consumed vs value delivered
- Portfolio metrics (aggregated):
  - Yield trending over time
  - Cross-workflow dependencies
  - Episode success rates
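As a rough illustration of how execution and outcome metrics could sit in one record, the sketch below pairs the runtime fields with a single value metric (recommendations accepted) and derives two aggregates from it. The RunRecord type and its field names are hypothetical and do not reflect the gh-aw runtime's actual schema; the sample numbers are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunRecord:
    """One workflow run: execution metrics plus a workflow-specific outcome metric."""
    workflow: str
    duration_s: float
    tokens: int
    cost_usd: float
    succeeded: bool
    recommendations_made: int = 0
    recommendations_accepted: int = 0


def acceptance_rate(records) -> Optional[float]:
    """Share of recommendations accepted, or None if none were made."""
    made = sum(r.recommendations_made for r in records)
    if made == 0:
        return None
    return sum(r.recommendations_accepted for r in records) / made


def cost_per_accepted(records) -> Optional[float]:
    """Cost consumed per accepted recommendation, or None if none were accepted."""
    accepted = sum(r.recommendations_accepted for r in records)
    if accepted == 0:
        return None
    return sum(r.cost_usd for r in records) / accepted


if __name__ == "__main__":
    runs = [
        RunRecord("go-fan.md", 42.0, 12000, 0.30, True, 4, 1),
        RunRecord("go-fan.md", 38.0, 11000, 0.20, True, 4, 3),
    ]
    print(acceptance_rate(runs), cost_per_accepted(runs))
```

Returning None instead of zero when nothing was made or accepted keeps "no data" distinguishable from "bad outcome", which matters while telemetry coverage is still being built up.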
Remediation Plan
Phase 1: Restore telemetry pipeline (Week 1)
- Fix Grafana Tempo datasource access
- Validate trace emission for top 10 workflows
- Create portfolio dashboard in Grafana
Phase 2: Instrument high-value workflows (Weeks 2-4)
- Add outcome metrics to 27 instrumentation-gap workflows
- Validate metrics emission with test runs
- Document instrumentation patterns for reuse
Phase 3: Mandatory instrumentation (Month 2)
- Require observability for all new workflows
- Backfill remaining workflows
- Achieve 100% validated telemetry coverage
Deterministic Portfolio JSON
The complete precomputed portfolio analysis is available at:
/tmp/aw-yield-precompute.json
Size: 7.8 MB (199,268 lines)
Structure:
{
  "workflows": [...],                      // 233 workflow metrics
  "portfolio_metrics": {...},              // Aggregate scores
  "overlap_clusters": [...],               // 3 detected clusters
  "overlap_pairs": [...],                  // Pairwise overlap matrix
  "recommendations_seed": {...},           // 233 recommendations
  "telemetry_coverage": {...},             // Coverage analysis
  "organizational_health_signals": {...},  // Health metrics
  "episode_metrics": []                    // (empty - no episodes)
}
Key Sections for Review:
.portfolio_metrics - Overall portfolio health scores
.workflows[] | sort_by(.yield) - Yield-ranked workflow list
.overlap_clusters - Consolidation opportunities
.recommendations_seed - Categorized recommendations
.organizational_health_signals - Governance metrics
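For readers who prefer a script to ad hoc queries, the yield-ranked view can be sketched in Python. This assumes the file on disk is plain JSON (without the explanatory comments shown in the structure outline) and that each workflow entry carries a yield key, as the sort expression above implies; the name key is an assumption and may differ in the real file.

```python
import json


def load_portfolio(path):
    """Read the precompute JSON from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def lowest_yield(portfolio, n=5):
    """Return (name, yield) for the n lowest-yield workflows."""
    ranked = sorted(portfolio["workflows"], key=lambda w: w["yield"])
    # "name" is a hypothetical key; .get() returns None if it is absent.
    return [(w.get("name"), w["yield"]) for w in ranked[:n]]


if __name__ == "__main__":
    sample = {
        "workflows": [
            {"name": "aw-portfolio-yield.md", "yield": 0.0835},
            {"name": "ace-editor.md", "yield": 0.0017},
            {"name": "test-workflow.md", "yield": 0.0034},
        ]
    }
    print(lowest_yield(sample, n=2))
```

Against the real artifact, the call would be lowest_yield(load_portfolio("/tmp/aw-yield-precompute.json")).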
Access:
The precompute JSON is attached to this workflow run as an artifact and available in the workflow runner at /tmp/aw-yield-precompute.json.
Metadata
- Generated by: aw-portfolio-yield workflow
- Precompute source: /tmp/aw-yield-precompute.json (7.8 MB)
- Agent analysis: /tmp/gh-aw/portfolio-yield-agent.json
- Telemetry validation: ❌ Grafana Tempo datasource not accessible
- Evidence basis: Deterministic precompute only (no live telemetry validation)
Generated by 📊 Agentic Workflow Portfolio Yield · ● 1.3M · ◷