fix(compile): suppress empty AGENTS.md shells for copilot when instructions already in .github/instructions/#1742
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Fixes issue #1730 by preventing apm compile --target copilot from writing header/footer-only AGENTS.md “shell” files when instructions are already deployed to .github/instructions/, while also enabling --clean to remove previously-generated empty shells safely (marker-gated so hand-authored files are preserved).
Changes:
- Suppress writing would-be-empty
AGENTS.mdplacements (and expose suppressed paths in results) whenskip_instructions=Trueand no constitution would be injected. - Marker-gate orphan detection/cleanup so
--cleanonly deletes APM-generatedAGENTS.mdfiles, and can remove stale empty shells. - Add acceptance/unit tests, update docs and changelog, and adjust formatter/logging to avoid contradictory output when everything is suppressed.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/compilation/test_empty_agents_shells_1730.py | New acceptance + unit tests for suppression behavior, logging, and --clean semantics. |
| tests/unit/compilation/test_distributed_compiler_phase3.py | Updates orphan/cleanup tests to use the APM marker and adds hand-authored protection coverage. |
| tests/unit/compilation/test_distributed_compiler_hermetic.py | Mirrors phase3 updates; adds coverage for suppressed_empty_paths affecting --clean. |
| src/apm_cli/compilation/distributed_compiler.py | Implements suppression logic, introduces generated-marker constant, and marker-gates orphan cleanup. |
| src/apm_cli/compilation/agents_compiler.py | Passes with_constitution, suppresses formatter output when fully suppressed, and emits INFO message. |
| docs/src/content/docs/producer/compile.md | Documents Copilot dedup behavior now omitting AGENTS.md when it would be redundant. |
| CHANGELOG.md | Records the fix and the new --clean behavior for empty shells + hand-authored protection. |
- Decide empty-shell suppression before generating content, skipping content generation + link resolution for placements that will be suppressed (perf: avoids wasted work). - Cache constitution-existence per directory across placements in compile_distributed() to avoid repeated disk reads; _is_placement_empty_shell consults the cache. - Promote the generated-file marker to public AGENTS_MD_GENERATED_MARKER (was _AGENTS_MD_GENERATED_MARKER) since it gates deletion and is a contract surface depended on by tests; mirrors public CLAUDE_HEADER. - Fix _cleanup_orphaned_files docstring to match debug-only/no-warning behavior for hand-authored skips. - Make the INFO-message test hermetic by writing the source instruction file on disk.
…ctions already in .github/instructions/ (closes microsoft#1730) When `apm compile -t copilot` runs after `apm install` has deployed instructions to `.github/instructions/`, the distributed compiler now detects that each AGENTS.md placement would be an empty shell (header + footer only, no instruction body) and suppresses writing those files. Key changes: - `distributed_compiler.py`: Phase 3b content-map pass checks `_is_placement_empty_shell()` for each placement when `skip_instructions=True`; constitution presence bypasses suppression since injection would add real body content; `_find_orphaned_agents_files` and `_cleanup_orphaned_files` now gate on `_AGENTS_MD_GENERATED_MARKER` to protect hand-authored AGENTS.md files; `CompilationResult` gains `suppressed_empty_paths` field. - `agents_compiler.py`: Emits INFO-level user log when suppressed paths exist: "AGENTS.md not generated -- Copilot reads `.github/instructions/` directly, no further action needed". - `--clean` removes pre-existing APM-generated empty shells via marker gate; hand-authored files are never deleted. - Plain `apm compile` (no `--clean`) is non-destructive: suppresses new writes only, leaves any pre-existing files untouched. - Multi-target builds (Codex/OpenCode/Windsurf/Gemini) are unaffected because `skip_instructions` is False for those targets. - `--no-dedup` / `--force-instructions` still produces full AGENTS.md. - 16 new acceptance tests in test_empty_agents_shells_1730.py; updated hermetic and phase3 test suites to use APM marker in orphan/cleanup tests. CHANGELOG entry references microsoft#1730, microsoft#1138, microsoft#1550.
…hell suppression Addresses all REQUIRED findings and cheap nits from the 6-persona review: 1. Fix contradictory terminal output (devx-ux): guard the placement-table formatter in _compile_distributed so it is suppressed when all placements are suppressed (suppressed_empty_paths non-empty AND content_map empty). INFO 'not generated' message still fires unconditionally. 2. Fix --no-constitution predicate/writer disagreement + model-based rewrite (primitives-architect + python-architect): thread with_constitution from CompilationConfig -> distributed_config -> compile_distributed -> _is_placement_empty_shell. Rewrite _is_placement_empty_shell to use the model (constitution check gated by with_constitution flag) rather than parsing rendered content strings. Drop the content parameter. Add tests for the disagreement case (with_constitution=False + constitution on disk). 3. Use _AGENTS_MD_GENERATED_MARKER constant at injection site (supply-chain): replace the hardcoded literal in _generate_agents_content with the constant. 4. Update docs/src/content/docs/producer/compile.md (devx-ux): correct the Copilot deduplication note to say AGENTS.md is omitted entirely when it would only carry instructions (not "still generated"); document that it is still written when carrying non-instruction content (constitution). 5. Fix two misleading comments in distributed_compiler.py (nit): clarify the placement.agents dead-code note and correct the generated_paths comment. 6. Disambiguate partial-suppression INFO message (devx-ux nit): use 'subdir placements not generated' when some AGENTS.md files were written and others suppressed, versus 'not generated' for full suppression. 7. Demote defense-in-depth hand-authored skip to _logger.debug (cli-logging nit): unreachable in normal operation; remove the user-facing [!] warning that would double-prefix and surface as noise. Update two tests accordingly.
- Decide empty-shell suppression before generating content, skipping content generation + link resolution for placements that will be suppressed (perf: avoids wasted work). - Cache constitution-existence per directory across placements in compile_distributed() to avoid repeated disk reads; _is_placement_empty_shell consults the cache. - Promote the generated-file marker to public AGENTS_MD_GENERATED_MARKER (was _AGENTS_MD_GENERATED_MARKER) since it gates deletion and is a contract surface depended on by tests; mirrors public CLAUDE_HEADER. - Fix _cleanup_orphaned_files docstring to match debug-only/no-warning behavior for hand-authored skips. - Make the INFO-message test hermetic by writing the source instruction file on disk.
3c687e5 to
ab3888b
Compare
The lockfile was inadvertently rewritten to point at an internal PyPI mirror (nexus-ci.onefiserv.net) instead of canonical PyPI. No dependency changed in this PR, so restore uv.lock to match main.
A: Extract _file_has_apm_marker() helper that reads only the first 4096
bytes of each candidate AGENTS.md; use it in both _find_orphaned_agents_files
and _cleanup_orphaned_files (DRY, bounded I/O).
B: Replace hard-coded /tmp/{name}.instructions.md default in
_make_instruction() with tempfile.gettempdir()-based path for
cross-platform portability.
C: project_with_stale_shell fixture now creates .apm/instructions/ and
writes the instruction source file, making it hermetic against future
filesystem validation.
D: test_no_placement_table_when_fully_suppressed does the same for its
inline instruction reference (also runs full compile() path).
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 10 comments.
Comments suppressed due to low confidence (1)
src/apm_cli/compilation/distributed_compiler.py:1
- With the new suppression behavior,
placementscan be non-empty whilecontent_mapis empty (fully suppressed). Any downstream stats/UX that treatlen(placements)as 'files generated' will now be wrong unless they’re updated to uselen(content_map)(or a separateplacements_countvsfiles_written/files_generatedsplit). Concretely, consider updating the distributed stats (and any caller summaries) to report the number of files that will actually be written (len(content_map)) and optionally add a separate stat for total placements analyzed.
"""Distributed AGENTS.md compilation system following the Minimal Context Principle.
Issue 1 (stats): _compile_distributed_stats now accepts content_map; sets agents_files_generated = len(content_map) so suppressed empty-shell placements are not counted, and adds placements_analyzed = len(placements) to preserve the total-placements signal. The two downstream overrides in agents_compiler.py (_compile_distributed dry-run and write paths) remain correct -- they already update the stat after writes complete. Issue 2 (docstring): _find_orphaned_agents_files param docstring for generated_paths clarifies it holds ALL placement paths (written OR suppressed), not just files written to disk, to prevent future misuse. Issue 3 (message): partial-suppression INFO log in _compile_distributed changed from "AGENTS.md subdir placements not generated" to "Some AGENTS.md placements not generated" -- the root AGENTS.md can also be suppressed (issue microsoft#1730 case A), so "subdir" was misleading. Issue 4 (fixtures): test_hand_authored_not_deleted_by_clean and project_with_github_instructions in test_empty_agents_shells_1730.py now create .apm/instructions/ and write the source instruction file on disk before constructing the Instruction object, matching the hermetic pattern used by other fixtures in this module. Tests updated: test_basic_stats_keys_present and test_stats_counts_placements in both hermetic and phase3 stats test classes now assert placements_analyzed is present and verify the new docstring.
…ression Comment T (test_empty_agents_shells_1730.py): remove tempfile.gettempdir() from _make_instruction helper; require either file_path or tmp_path so each test invocation gets a unique, worktree-isolated path rather than a shared system temp filename. Removed the now-unused `import tempfile`. Updated the one call site that omitted both (TestIsPlacementEmptyShell line ~595) to pass tmp_path. Comment U (distributed_compiler.py _cleanup_orphaned_files): moved `rel_path = portable_relpath(...)` inside the try block so a failure there is caught by the existing except handler. Added `rel_path: str | None = None` initializer before the try; except now uses `rel_path if rel_path is not None else str(file_path)` as the display name so the message is always informative even if portable_relpath itself raised. Comment V (same function): track `removed_count`; append " 0 files removed (all skipped)" when the loop exits with nothing removed (all hand-authored skip path). Added test_all_hand_authored_reports_zero_removed to both test_distributed_compiler_hermetic.py and test_distributed_compiler_phase3.py asserting the new honest summary.
A (src): replace read_constitution() presence check with find_constitution().is_file()
in _is_placement_empty_shell — avoids unnecessary file reads for a boolean check.
Import swapped from read_constitution to find_constitution accordingly.
B/C (tests): clarify TestCaseAGlobalInstructions class docstring and
test_case_a_no_root_agents_md_written docstring to name both "copilot target
(internal alias `vscode`)" so readers aren't confused by the target="vscode" config value.
D (tests): add no-op warning/error/success methods to both _MockLogger stubs
(TestInfoLogMessage and TestFormatterGuardFullySuppressed) so an AttributeError
from future _log("warning"/"error") calls doesn't mask the real failure.
Also update TestIsPlacementEmptyShell to construct real constitution files on disk
(via _write_constitution helper) instead of monkeypatching read_constitution, to
exercise the new find_constitution(...).is_file() code path directly.
Fold shepherd panel follow-ups for PR microsoft#1742: move all-suppressed state onto the distributed compilation result, tighten Copilot suppression output with the --no-dedup escape hatch, correct docs and changelog drift, and make the integration dedup test assert the new no-AGENTS contract directly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fold the final CLI logging nit by naming the suppressed AGENTS.md files in the partial-suppression escape-hatch message. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 0 | 2 | Suppression logic is well structured; remaining notes are polish only. |
| CLI Logging Expert | 0 | 0 | 0 | Suppression messages are actionable and use info semantics. |
| DevX UX Expert | 0 | 0 | 0 | The --no-dedup escape hatch is discoverable in output and docs. |
| Supply Chain Security Expert | 0 | 0 | 0 | Marker-gated deletion preserves hand-authored AGENTS.md files. |
| OSS Growth Hacker | 0 | 0 | 0 | CHANGELOG and docs are now concise and user-facing. |
| Doc Writer | 0 | 0 | 1 | Cross-page Copilot dedup wording is accurate; one optional shortening nit remains. |
| Test Coverage Expert | 0 | 0 | 0 | Regression traps cover the #1730 promise at unit and integration tiers. |
Counts are signal strength, not gates. The maintainer ships.
Recommendation
Ship now. The fold commits addressed the in-scope panel findings, local lint and targeted tests passed, mutation-break evidence was collected for the new integration assertion, and GitHub CI is green on the latest head.
Folded in this run
- (panel) Added
CompilationResult.all_suppressedand used it from the orchestrator instead of re-deriving suppression state -- resolved ind582b122. - (panel) Added the
--no-dedupescape hatch to Copilot suppression INFO output -- resolved ind582b122. - (panel) Corrected Copilot dedup documentation in
ide-tool-integration.md,CHANGELOG.md, andpackages/apm-guide-- resolved ind582b122. - (panel) Made the integration dedup test assert that no
AGENTS.mdfile is written after install populates.github/instructions/-- resolved ind582b122. - (panel) Clarified the partial-suppression override message to name all suppressed
AGENTS.mdfiles -- resolved in458ded64.
Regression-trap evidence (mutation-break gate)
tests/integration/test_install_compile_copilot_dedup_e2e.py::test_install_then_compile_skips_duplicated_instructions-- deletedDistributedAgentsCompiler.compile_distributedsuppression guard; test FAILED as expected; guard restored.
Lint contract
uv run --extra dev ruff check src/ tests/ and
uv run --extra dev ruff format --check src/ tests/ both silent.
CI
gh pr checks 1742 --repo microsoft/apm --watch completed green on head 458ded6494ad0401b8407ec9caa267404ff3f81d; CI run https://github.com/microsoft/apm/actions/runs/27362820294 (after 0 CI fix iteration(s)).
Mergeability status
Captured from gh pr view 1742 --json mergeable,mergeStateStatus,statusCheckRollup immediately after the last push of this run.
| PR | head SHA | CEO stance | iters | folds | defers | Copilot rounds | CI | mergeable | mergeStateStatus | notes |
|---|---|---|---|---|---|---|---|---|---|---|
| #1742 | 458ded6 |
ship_now | 2 | 5 | 0 | 2 | green | MERGEABLE | BLOCKED | pending maintainer review |
Convergence
2 outer iteration(s); 2 Copilot round(s). Final panel verdict: ship_now.
Ready for maintainer review.
Full per-persona findings
Python Architect
- [nit]
all_suppressedcould be named more explicitly for future callers. Current use is correct. - [nit] The
placement.agentsdead-code cleanup is correct; future similar cleanups can be separate commits for bisect clarity.
CLI Logging Expert
No findings after the final suppression-message clarification.
DevX UX Expert
No findings.
Supply Chain Security Expert
No findings.
OSS Growth Hacker
No findings.
Auth Expert -- inactive
PR touches compilation code, compile docs/changelog, apm-guide command docs, and compilation tests; no auth/token/credential/host/fallback surface changed.
Doc Writer
- [nit]
ide-tool-integration.mdcould shorten its Copilot dedup block further now that it links to the canonical compile page. Current wording is accurate.
Test Coverage Expert
No findings.
Performance Expert -- inactive
PR touches compile/codegen paths, not dependency fetch, resolve, materialize, cache, or verify hot paths.
This panel is advisory. It does not block merge. Re-apply the panel-review label after addressing feedback to re-run.
…ges (#1757) * chore: release v0.20.0 Bump pyproject.toml + uv.lock to 0.20.0 and roll the [Unreleased] CHANGELOG block into [0.20.0] - 2026-06-11. Lint mirror green locally (ruff check + format, pylint R0801, auth-signals). Post-merge: tag v0.20.0 to trigger the release workflow. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test(integration): refresh stale integration tests after v0.20.0 changes The release integration suite (which gates releases, not per-PR runs) surfaced 29 failures after v0.20.0 was tagged. Each was a stale test that had not been updated to match an intentional source behavior change merged this cycle. Source is correct throughout; only tests change here. Clusters and causing PRs: - A (#1735): download_github_file now calls _host._resolve_dep_auth_ctx; mock host helpers never stubbed it, forcing the wrong auth path. Added the stub to make_host/_make_host in 4 download test files. Also #1735 made generic raw-URL network errors raise RuntimeError instead of falling back to the Contents API -- rewrote 2 stale fallback tests to assert the raise. - B (#1742): apm compile -t copilot suppresses empty AGENTS.md shells when .github/instructions/*.md exists. Flipped 3 assertions to expect no AGENTS.md and assert the instruction files are present. - C (#1739): marketplace fetches moved to a shared requests.Session (_HTTP_SESSION.get); updated 6 tests to mock that seam. - D (#1734): optional registry env inputs are omitted without an override; updated 2 placeholder tests to assert the var is absent. - E (#1734): same omit-optional change; assert var not in result. - F (#1720): tar symlink rejection wording changed; updated the regex. Verified locally: 738 passed, 16 skipped (token-gated e2e). The two runnable cluster-B e2e tests (mixed_deps, guardrailing) pass against the v0.20.0 build; ado_e2e is identical logic, validated in CI. Full lint mirror green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: danielmeppiel <danielmeppiel@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…r changes (#1758) * chore: release v0.20.0 Bump pyproject.toml + uv.lock to 0.20.0 and roll the [Unreleased] CHANGELOG block into [0.20.0] - 2026-06-11. Lint mirror green locally (ruff check + format, pylint R0801, auth-signals). Post-merge: tag v0.20.0 to trigger the release workflow. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test(integration): refresh stale integration tests after v0.20.0 changes The release integration suite (which gates releases, not per-PR runs) surfaced 29 failures after v0.20.0 was tagged. Each was a stale test that had not been updated to match an intentional source behavior change merged this cycle. Source is correct throughout; only tests change here. Clusters and causing PRs: - A (#1735): download_github_file now calls _host._resolve_dep_auth_ctx; mock host helpers never stubbed it, forcing the wrong auth path. Added the stub to make_host/_make_host in 4 download test files. Also #1735 made generic raw-URL network errors raise RuntimeError instead of falling back to the Contents API -- rewrote 2 stale fallback tests to assert the raise. - B (#1742): apm compile -t copilot suppresses empty AGENTS.md shells when .github/instructions/*.md exists. Flipped 3 assertions to expect no AGENTS.md and assert the instruction files are present. - C (#1739): marketplace fetches moved to a shared requests.Session (_HTTP_SESSION.get); updated 6 tests to mock that seam. - D (#1734): optional registry env inputs are omitted without an override; updated 2 placeholder tests to assert the var is absent. - E (#1734): same omit-optional change; assert var not in result. - F (#1720): tar symlink rejection wording changed; updated the regex. Verified locally: 738 passed, 16 skipped (token-gated e2e). The two runnable cluster-B e2e tests (mixed_deps, guardrailing) pass against the v0.20.0 build; ado_e2e is identical logic, validated in CI. Full lint mirror green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test(release): refresh release-validation harness for v0.20.0 behavior changes The release-validation shell harness carries its own copies of behavior assertions that duplicate the integration suite. Two of them went stale this cycle from the same PRs that broke the integration tests (#1757): - GH-AW compat (#1720): `apm pack --archive` now emits .zip by default; the archive check grepped only `build/*.tar.gz`. Accept either extension, testing each glob independently (a single `ls a b` exits non-zero when either pattern is unmatched, even if the other matches). - Hero scenario 2 / AGENTS.md (#1742): copilot `apm compile` omits the empty AGENTS.md shell when installed instructions already live under `.github/instructions/`. The check insisted AGENTS.md exist; now accept AGENTS.md OR a populated `.github/instructions/`, mirroring the merged pytest fix in test_guardrailing_hero_e2e.py. Same fixes applied to the Windows .ps1 (AGENTS.md only; it has no archive check). Predicates validated locally against apm v0.20.0. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: danielmeppiel <danielmeppiel@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Description
apm compile --target copilotno longer writes content-freeAGENTS.mdshells when instructions have already been deployed to.github/instructions/.TL;DR — After
apm installpopulates.github/instructions/, the instruction-dedup logic from #1550 correctly omits instruction bodies fromAGENTS.md, but still wrote header-and-footer-only files — for the root placement and every distributed subdir placement (docs/AGENTS.md,src/AGENTS.md, …). These ~233-byte shells are pure noise. This PR suppresses any placement whose only content would have been the now-elided instructions, emits an INFO-style message explaining why, and teachesapm compile --cleanto remove pre-existing APM-generated empty shells (hand-authored files are never touched).Fixes #1730 (related: #1138, #1550)
Problem
can_dedup_agents_md_instructions()returnsTrueonly when the sole AGENTS.md consumer is Copilot (target set{vscode}). In that case_generate_agents_content()was gating just the instruction body onskip_instructions, so every placement still produced a non-empty string (header + footer) and was written:AGENTS.mdat the project root.applyTo-scoped instructions, one empty shell per subdirectory (docs/AGENTS.md,src/AGENTS.md, …).The CLAUDE.md path already suppressed the analogous empty file (#1138); the Copilot/AGENTS.md path ported the skip logic but not the suppression.
--cleandid not help either: the shells are regenerated every run, so they were never classified as orphans.Approach
Mirror and extend the CLAUDE.md behaviour on the distributed AGENTS.md path:
skip_instructionsis active, drops would-be-empty placements into asuppressed_empty_pathslist instead ofcontent_map, so they are never written._is_placement_empty_shell()is model-based (no brittle header/footer string parsing) andwith_constitution-aware, so the predicate agrees with the writer under--no-constitution.--cleanremoves stale shells. Suppressed paths are promoted to orphan candidates under--clean, gated on the APM-generated marker so hand-authoredAGENTS.mdfiles are never deleted.Implementation
flowchart TD A[apm compile -t copilot] --> B{can_dedup_agents_md_instructions?} B -- no (Codex/OpenCode/Windsurf/Gemini/multi) --> F[skip_instructions = False<br/>full AGENTS.md written] B -- yes, .github/instructions populated --> C[skip_instructions = True] C --> D[Phase 3b: per placement] D --> E{_is_placement_empty_shell<br/>with_constitution-aware?} E -- empty shell --> G[suppressed_empty_paths<br/>not written] E -- has constitution / content --> H[content_map: written] G --> I{--clean?} I -- yes --> J[remove stale APM-generated<br/>shells, marker-gated] I -- no --> K[non-destructive: leave existing files]Key points:
skip_instructionsis alreadyTrue, whichcan_dedup_agents_md_instructions()restricts to the Copilot-only set. Codex / OpenCode / Windsurf / Gemini and any multi-target build are provably unaffected (test-covered).--no-dedup/--force-instructionsstill produce a fullAGENTS.md.AGENTS_MD_GENERATED_MARKER) used at the write site and both--cleangate sites, so write-marker and check-marker can never drift. It is public (mirroringCLAUDE_HEADER) because it gates deletion and is a stable contract surface.compile_distributed()memoizes constitution existence per directory (dict[Path, bool]), so repos with many placements under a shared tree read each constitution at most once per compile rather than once per placement.Type of change
docs/src/content/docs/producer/compile.mddedup note)Testing
Evidence (run against this branch's
src/):ruff check src/ tests/→ All checks passedruff format --check src/ tests/→ 1235 files already formattedpytest tests/unit/compilation/→ 1135 passedNew tests (
tests/unit/compilation/test_empty_agents_shells_1730.py, plus updates to the hermetic/phase3 suites) cover: Case A (root) and Case B (distributed subdirs) producing no empty file; the INFO message;--cleanremoving a stale APM-generated shell while leaving hand-authored files; constitution-bearing placements still written; multi-target builds unaffected;--no-dedupstill full; and thewith_constitution=Falsepredicate/writer agreement. The Case A/B and--cleantests assert filesystem state (mutation-resistant — they fail if the suppression guard is reverted).How to test manually
Spec conformance (OpenAPM v0.1)
src/apm_cli/compilation/(not an OpenAPM critical path) and alters only compile-timeAGENTS.mdemission; no manifest, lockfile, resolution, policy, or security behaviour changes.