feat(deps): fetch path-scoped files over git transport instead of host API#1740
Conversation
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 2 | 1 | Clean module seam; narrow the broad except so containment errors propagate. |
| CLI Logging Expert | 0 | 1 | 1 | git->REST fallback is silent; surface it for 410 debugging. |
| DevX UX Expert | 0 | 1 | 1 | #872 error is excellent; heuristic can false-positive on primitive-named subgroups. |
| Supply Chain Security | 0 | 1 | 0 | Containment solid and auth URL not leaked; don't swallow PathTraversalError. |
| OSS Growth Hacker | 0 | 0 | 1 | Strong contributor credit + clear CHANGELOG; amplify the 410 fix. |
| Test Coverage Expert | 0 | 1 | 0 | Sparse-fetch + #872 regressions verified present; add two coupled guards. |
| Auth Expert | 0 | 0 | 1 | git-first default preserved; token precedence and git_env reuse clean. |
| Doc Writer | 0 | 0 | 1 | All four doc surfaces + CHANGELOG updated; trim trailing blank lines. |
| Performance Expert | 0 | 1 | 0 | blob:none + depth=1 + cone correct; watch O(files) for same-repo multi-fetch. |
B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.
Top 5 follow-ups
- [Supply Chain Security] Narrow the
except Exceptionindownload_gitlab_filesoPathTraversalError/ validation errors propagate instead of silently falling through to the REST transport. -- A containment guard should never be reinterpreted as "git transport unavailable"; keep defense-in-depth even though the REST path encodes the path today. - [DevX UX Expert] Tighten the Support installing individual skills/agents from monorepo packages (including private repo support) #872 heuristic: a GitLab subgroup literally named after a primitive (e.g.
git@gitlab.com:group/skills/repo.git) trips_check_no_embedded_subpath. -- Would wrongly fail a legitimate dep; consider also requiring the trailing segment to look like a primitive, or document the limitation. - [Test Coverage Expert] Add two regression tests: PathTraversalError must not be swallowed into the REST fallback, and the subgroup-named-primitive shape must parse cleanly. -- Locks follow-ups 1 and 2 against silent regression.
- [CLI Logging Expert] Emit an info/verbose note when the git transport fails and REST serves the file. -- The git failure is currently
_debug-only, hiding which transport answered and complicating 410 triage. - [Performance Expert] Track batching multiple same-repo
path:fetches into a single sparse cone if multi-file deps become common. -- Per-file fetch is O(files) fresh git inits today; the board accepted no-caching for the single-file case, so this is a watch item, not a change.
Architecture
classDiagram
class DownloadDelegate {
+download_gitlab_file(dep_ref, file_path, ref) bytes
}
class git_file_transport {
+fetch_file_via_git_sparse(dep_ref, file_path, ref) bytes
}
class DependencyReference {
+parse(dependency_str)
-_check_no_embedded_subpath(url)
-_APM_PRIMITIVE_DIRS
}
class path_security {
+validate_path_segments(path)
+ensure_path_within(target, root)
}
DownloadDelegate ..> git_file_transport : git-first
DownloadDelegate ..> path_security : (REST path encodes)
git_file_transport ..> path_security : containment
DependencyReference ..> path_security : segment validation
flowchart TD
A[download_gitlab_file] --> B{git transport}
B -->|fetch_file_via_git_sparse| C[validate_path_segments before git]
C --> D[git init + sparse-checkout set parent]
D --> E[git fetch --filter=blob:none --depth=1]
E --> F[git checkout FETCH_HEAD]
F --> G[ensure_path_within target work_dir]
G --> H[return file bytes]
B -->|Exception caught| I[REST v4 fallback with GITLAB_APM_PAT]
I --> J[return bytes or RuntimeError]
sequenceDiagram
participant D as DownloadDelegate
participant G as git_file_transport
participant S as path_security
D->>G: fetch_file_via_git_sparse(dep_ref, file_path, ref)
G->>S: validate_path_segments(file_path)
G->>G: git init / remote add / sparse-checkout set
G->>G: git fetch --filter=blob:none --depth=1
G->>G: git checkout FETCH_HEAD
G->>S: ensure_path_within(target, work_dir)
G-->>D: file bytes
Recommendation
Ship after a light pass on the follow-ups. The PR fulfills the board brief end to end with strong test coverage and correct, unconditional containment. Recommended next action: narrow the fallback except so PathTraversalError propagates (follow-up 1) and add the two coupled regression tests (follow-up 3); the DevX false-positive (follow-up 2) and the logging/perf items can land in the same pass or as fast follow. None of these block merge -- the maintainer performs the protected-branch merge once #1571/#1732 lands and CI retargets to main.
Full per-persona findings
Python Architect
- [recommended] Broad
except Exception as git_errindownload_gitlab_fileswallows the git path's validation errors and falls through to REST atsrc/apm_cli/deps/download_strategies.py
The fallback intent is "git transport unavailable -> try API", butPathTraversalErrorfromvalidate_path_segments()and containment failures are also caught, conflating a security-validation outcome with a transport-availability outcome.
Suggested: Catch a narrower set (subprocess/RuntimeError/OSError) or re-raisePathTraversalErrorbefore entering the REST branch. - [recommended]
fetch_file_via_git_sparsereachesself._host._build_repo_url(a private member) through the injectedbuild_repo_url_fnatsrc/apm_cli/deps/git_file_transport.py
The callable-injection cleanly avoids a circular import, which is the right call; the residual smell is binding to a private method. A small public seam on the host would make the contract explicit.
Suggested: Expose a thin publicbuild_repo_url(...)wrapper and pass that. - [nit]
work_dir.mkdir(exist_ok=True)under a freshly createdmkdtempprefix never needsexist_okatsrc/apm_cli/deps/git_file_transport.py
Harmless, butexist_ok=Truehides an unexpected collision in the unique temp dir.
CLI Logging Expert
- [recommended] The git->REST fallback is observable only at
_debugatsrc/apm_cli/deps/download_strategies.py
When git transport fails and REST succeeds, the user sees the normal success message with no indication that a fallback occurred -- which is precisely the signal a self-hosted-GitLab operator needs when diagnosing transport selection.
Suggested: Emit a[i]/verbose line such as "git transport unavailable; using GitLab REST API fallback" before the REST attempt. - [nit] Success messages diverge between transports ("Downloaded file via git: ..." vs "Downloaded file: ...") at
src/apm_cli/deps/download_strategies.py
Reasonable and within ASCII conventions; consider including the ref in the git-path message for parity.
DevX UX Expert
- [recommended] The Support installing individual skills/agents from monorepo packages (including private repo support) #872 heuristic can false-positive on a GitLab subgroup literally named after a primitive at
src/apm_cli/models/dependency/reference.py
_check_no_embedded_subpathflags any non-last segment matching an APM primitive name; a legitimategit@gitlab.com:group/skills/repo.git(subgroup named "skills") has interior segment "skills" and would be wrongly failed.
Suggested: Also require the trailing segment(s) to resemble a primitive target (e.g.<primitive>/<name>), or document the constraint in the error/docs. - [nit] Friendly error is well-formed at
src/apm_cli/models/dependency/reference.py
Uses the[x]status symbol, names both thepath:key and the bare shorthand, and echoes the offending input via{raw!r}. This is the model the repo wants for actionable parse errors.
Supply Chain Security
- [recommended] Containment guard demoted by the fallback's broad
exceptatsrc/apm_cli/deps/download_strategies.py
Positive: containment is unconditional on the git path (validate_path_segmentspre-git,ensure_path_withinpost-checkout, both tested), and the error string usescmd[:3]so the auth-embedded clone URL is never logged. The gap is that the same broadexceptthat enables REST fallback also swallowsPathTraversalError. Not exploitable today (the REST URLquote()-encodes the path into one inert component), but a validation failure should not be silently reinterpreted as a transport miss.
Suggested: Re-raisePathTraversalError(and other containment errors) before falling back.
OSS Growth Hacker
- [nit] Strong adoption hygiene at
CHANGELOG.md
Clear "so what" CHANGELOG entry, explicit@cossonscredit for root-cause + design, and a pre-merge reporter-validation ask. Worth converting the self-hosted-GitLab 410 fix into a release highlight and a reply on the originating thread.
Test Coverage Expert
- [recommended] Two coupled regression tests are missing
Verified present and solid (inspected the diff): the 12-testtests/test_gitlab_git_transport.pysparse-fetch/containment/fallback suite and thetests/unit/test_generic_git_urls.py::TestEmbeddedSubpathInGitUrlSupport installing individual skills/agents from monorepo packages (including private repo support) #872 suite (which correctly asserts on the parsed message via regex, not a URL substring, and includes no-false-positive cases). Gaps: (1) no test thatdownload_gitlab_filedoes NOT swallow aPathTraversalErrorinto the REST fallback; (2) no no-regression test for a GitLab subgroup named after a primitive.
Suggested: Add both; they pin the supply-chain and DevX follow-ups.
Auth Expert
- [nit] Token model is clean at
src/apm_cli/deps/download_strategies.py
git-first remains the default; the REST fallback resolves via the existingauth_resolverwithGITLAB_APM_PAT->GITLAB_TOKEN->git credential fillprecedence;git_envmergesos.environwithself._host.git_envso no new credential or origin is introduced. Worth a quick confirm that resolver order matches theGITLAB_TOKEN"lower-precedence" claim now documented in the skill resource.
Doc Writer
- [nit] Documentation coverage is complete at
docs/src/content/docs/consumer/authentication.md
Both consumer docs and bothpackages/apm-guide/.apm/skills/apm-usage/resources (authentication + dependencies) are updated alongside the CHANGELOG, satisfying the docs-sync instruction. Minor: trim the double trailing blank lines added inapm-usage/authentication.md.
Performance Expert
- [recommended] Per-file fetch is correct but unbatched at
src/apm_cli/deps/git_file_transport.py
blob:none + depth=1 + cone sparse-checkout is the right hot-path shape and is genuinely not a full clone (root files correctly skip cone). The watch item: eachpath:file from the same repo spawns a freshgit init+ fetch into a new temp dir, so a multi-file same-repo dep is O(files) inits. The board accepted no caching for the single-file case; revisit only if multi-file deps become common.
Suggested: If needed later, set multiple sparse paths in one cone and checkout once.
This panel is advisory. It does not block merge. Re-apply the
panel-review label after addressing feedback to re-run.
Two panel follow-ups on the #1014 GitLab git-transport surface. Supply-chain: download_gitlab_file caught a bare `except Exception` around the git-transport attempt, so a PathTraversalError raised by the path-validation / symlink-containment guards in fetch_file_via_git_sparse was silently swallowed and the request fell through to the REST transport. A rejected traversal must not gain a second transport to probe. PathTraversalError now propagates unchanged; only genuine transport failures fall back to REST. CLI logging: the git->REST fallback transition was `_debug`-only, so triagers had no signal which transport answered a 410. The fallback now emits a verbose `[i]` note carrying the git failure reason, and the REST success notes attribute the GitLab REST API explicitly. Adds a regression test asserting a PathTraversalError in the git path is not swallowed into the REST fallback (mutation-break verified). Updates existing verbose-callback tests that exercised the unmocked-git fallback path to the new transport-attributed messages. addresses CEO follow-up (Supply Chain) and (CLI Logging) on PR #1740 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
f6cb096 to
a03ac65
Compare
The #872 guard `_check_no_embedded_subpath` flagged any non-last path segment matching an APM primitive directory name, which produced two false positives on legitimate URLs: - A GitLab subgroup literally named after a primitive, e.g. `git@gitlab.com:group/skills/repo.git`, where `skills` is a subgroup at index 1 and `repo` is the real repository. - Azure DevOps virtual-path URLs, e.g. `https://dev.azure.com/org/proj/_git/repo/instructions/x`, where the primitive segment is the supported ADO virtual path after `_git/repo`. The embedded-subpath shape is `org/repo` + `<primitive>/<name>`, so a primitive segment now only fires when preceded by a complete org/repo prefix (index >= 2), and ADO URLs (identified by the `_git` marker) are skipped entirely. The canonical malformed form `org/repo/skills/<name>` that #872 targets still hard-fails; no GitHub or GitLab repo path uses `_git`, so real detection is unaffected. A residual deep-subgroup ambiguity (`group/sub/skills/repo`) is documented in code. Adds a regression test for the primitive-named subgroup (mutation-break verified); the ADO false positive is covered by the existing TestParseAdoUrls suite. addresses CEO follow-up (DevX) on PR #1740; Refs #872 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Shepherd-driver terminal advisoryCEO stance carried from the prior panel pass: Conflict resolved against mainPR was
CHANGELOG keeps both entries. Final head Folded in this run
Regression-trap evidence (mutation-break gate)
Copilot signals reviewed
Deferred (out-of-scope follow-up)
Consumer-install confirmationA Lint contract
CIAll required checks green on Mergeability status
Conflict resolved, follow-ups folded, CI green. Ready for maintainer |
There was a problem hiding this comment.
Pull request overview
This PR updates APM’s GitLab path: file download behavior to be git-transport-first (sparse/partial checkout) instead of GitLab’s REST API, fixing self-hosted GitLab instances where the REST API returns HTTP 410. It also adds a parse-time guard that rejects embedded APM primitive subpaths inside explicit git URLs and points users to the path: key.
Changes:
- Add
fetch_file_via_git_sparse()to fetch singlepath:files usinggit fetch --filter=blob:none --depth=1plus sparse checkout (when applicable). - Update GitLab file download strategy to try git transport first, then fall back to GitLab REST v4 API (PAT/token-based) with improved transport attribution in verbose output.
- Add tests + documentation + changelog entries for the new GitLab behavior and the friendly “embedded subpath in git URL” error.
Show a summary per file
| File | Description |
|---|---|
src/apm_cli/deps/git_file_transport.py |
New git sparse/partial single-file fetch implementation with containment checks. |
src/apm_cli/deps/download_strategies.py |
Routes GitLab file download through git transport first, REST API fallback second; updates verbose messages. |
src/apm_cli/models/dependency/reference.py |
Adds _check_no_embedded_subpath() guard to reject embedded primitive subpaths in explicit git URL forms. |
tests/test_gitlab_git_transport.py |
New tests covering git-transport-first behavior, traversal rejection, symlink escape, and REST fallback. |
tests/unit/test_generic_git_urls.py |
Updates existing URL-with-path rejection tests and adds new embedded-subpath guard coverage. |
tests/unit/deps/test_download_strategies_selection.py |
Adjusts verbose-callback assertions for GitLab REST attribution. |
tests/unit/deps/test_download_strategies_phase3.py |
Same as above for phase3 suite. |
tests/integration/test_download_strategies_selection.py |
Updates assertions for REST API attribution message in verbose callback. |
tests/integration/test_download_strategies_phase3w5.py |
Same as above for phase3w5 integration suite. |
docs/src/content/docs/consumer/manage-dependencies.md |
Documents GitLab git-transport-first path: fetching with REST fallback. |
docs/src/content/docs/consumer/authentication.md |
Documents GitLab behavior and GITLAB_APM_PAT fallback guidance. |
packages/apm-guide/.apm/skills/apm-usage/dependencies.md |
Updates usage skill docs for GitLab path: transport behavior. |
packages/apm-guide/.apm/skills/apm-usage/authentication.md |
Updates usage skill docs for GitLab auth and fallback behavior. |
CHANGELOG.md |
Adds a user-visible fix entry for GitLab 410 + git-transport-first fetching. |
Copilot's findings
- Files reviewed: 14/14 changed files
- Comments generated: 6
| except Exception as git_err: | ||
| _debug(f"git transport failed for {host}/{dep_ref.repo_url}/{file_path}: {git_err}") | ||
| if verbose_callback: | ||
| verbose_callback( | ||
| f"[i] git transport unavailable for {host}/{dep_ref.repo_url}/{file_path}; " | ||
| f"falling back to GitLab REST API ({git_err})" | ||
| ) |
| d.download_gitlab_file(self._dep(), "apm.yml", verbose_callback=callback) | ||
| callback.assert_called_once() | ||
| # Git transport is unmocked here and fails, so the callback fires for | ||
| # the fallback-transition note AND the REST-API success note. At least | ||
| # one call must attribute the REST API as the transport that answered | ||
| # (transport-attribution for 410 triage). | ||
| assert callback.called | ||
| assert any("REST API" in str(c.args[0]) for c in callback.call_args_list) |
| d.download_gitlab_file(self._dep(), "apm.yml", verbose_callback=callback) | ||
| callback.assert_called_once() | ||
| # Git transport is unmocked here and fails, so the callback fires for | ||
| # the fallback-transition note AND the REST-API success note. At least | ||
| # one call must attribute the REST API as the transport that answered | ||
| # (transport-attribution for 410 triage). | ||
| assert callback.called | ||
| assert any("REST API" in str(c.args[0]) for c in callback.call_args_list) |
| result = delegate.download_gitlab_file(dep, "README.md", verbose_callback=callback) | ||
|
|
||
| assert result == b"ok" | ||
| callback.assert_called_once_with("Downloaded file: gitlab.example.com/group/repo/README.md") | ||
| # Git transport is unmocked and fails, so a fallback-transition note | ||
| # fires before the REST-API success note. The success note attributes | ||
| # the GitLab REST API as the transport that answered (410 triage). | ||
| callback.assert_any_call( | ||
| "[i] Downloaded file via GitLab REST API: gitlab.example.com/group/repo/README.md" | ||
| ) |
| result = delegate.download_gitlab_file(dep, "README.md", verbose_callback=callback) | ||
|
|
||
| assert result == b"ok" | ||
| callback.assert_called_once_with("Downloaded file: gitlab.example.com/group/repo/README.md") | ||
| # Git transport is unmocked and fails, so a fallback-transition note | ||
| # fires before the REST-API success note. The success note attributes | ||
| # the GitLab REST API as the transport that answered (410 triage). | ||
| callback.assert_any_call( | ||
| "[i] Downloaded file via GitLab REST API: gitlab.example.com/group/repo/README.md" | ||
| ) |
| # Create a symlink inside the work dir pointing outside. | ||
| (work_dir / "agents").mkdir() | ||
| (work_dir / "agents" / "evil.agent.md").symlink_to(outside) | ||
|
|
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 1 | 1 | Clean stateless free-function module with proper containment; minor coupling nit on self._host._build_repo_url private-method access. |
| Cli Logging Expert | 0 | 0 | 2 | Verbose callbacks mix [i] prefixes inconsistently; git_file_transport has no verbose instrumentation for agent-mode diagnostics. |
| Devx Ux Expert | 0 | 0 | 2 | Git-transport-first is invisible on happy path; embedded-subpath guard gives actionable error with correct fix syntax. Ship. |
| Supply Chain Security Expert | 0 | 1 | 1 | Containment model is sound: validate_path_segments before git, ensure_path_within after checkout, PathTraversalError blocks fallback probe. Two recommended hardening items. |
| Oss Growth Hacker | 0 | 0 | 2 | Good story-shaped feature for GitLab users; CHANGELOG entry is release-ready; docs update the auth page correctly. |
| Auth Expert | 0 | 1 | 1 | Auth delegation is correct (build_repo_url_fn + git_env reuse); PathTraversalError hard-fail prevents fallback bypass; minor stderr credential leak risk in error path. |
| Doc Writer | 0 | 1 | 2 | Docs are accurate and git-transport-first framing is correct; consolidate duplicated GitLab path: transport prose into one source-of-truth page. |
| Test Coverage Expert | 0 | 0 | 1 | All critical surfaces covered: git transport, PathTraversalError non-swallow, embedded-subpath parse guard, credential-leak guard in fallback diagnostics. Ship. |
| Performance Expert | 0 | 2 | 1 | Git-transport-first is the correct perf direction; per-file init+fetch is acceptable for current path-file volumes; same-repo batching is the clear next-step win. |
B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.
Top 5 follow-ups
- [Supply Chain Security Expert] Sanitize stderr before embedding in RuntimeError to prevent credential echo from older git versions -- Test evidence confirms stderr is included in the raised error. Phase-A sanitized the current caller, but the module is reusable and a future direct caller could surface auth URLs.
- [Auth Expert] Set GIT_TERMINAL_PROMPT=0 in git subprocess env to prevent interactive credential hangs -- Without it, failed SSH auth can hang 120s before REST fallback fires. Single env-var fix eliminates a real user-visible timeout on misconfigured hosts.
- [Doc Writer] Consolidate GitLab transport/fallback prose into authentication.md as single source of truth -- Duplicated fallback details across authentication.md and manage-dependencies.md can drift on the next auth change. Trim manage-dependencies to mechanism plus link.
- [Python Architect] Expose build_repo_url as a public DownloadDelegate method instead of injecting a private-method reference -- Functional now, but private-method cross-boundary access can confuse contributors tracing the call graph.
- [Performance Expert] Batch path: files sharing (url, ref) into one sparse clone (deferred by design) -- N fetches for N files from the same repo is the clear next perf win once the transport is proven stable in production.
Architecture
classDiagram
direction LR
class GitHubPackageDownloader {
<<Facade>>
+_strategies: DownloadDelegate
+_build_repo_url(repo_ref, ...) str
+git_env: dict
}
class DownloadDelegate {
<<Delegate>>
+_host: GitHubPackageDownloader
+download_gitlab_file(dep_ref, file_path, ref) bytes
+build_repo_url(repo_ref, ...) str
}
class GitFileTransport {
<<Module>>
+fetch_file_via_git_sparse(dep_ref, file_path, ref, build_repo_url_fn, git_env) bytes
}
class DependencyReference {
<<ValueObject>>
+host: str
+repo_url: str
+_check_no_embedded_subpath(url) None
}
class PathSecurity {
<<Module>>
+validate_path_segments(path, context)
+ensure_path_within(path, base_dir)
}
GitHubPackageDownloader *-- DownloadDelegate : owns
DownloadDelegate ..> GitFileTransport : calls primary
GitFileTransport ..> PathSecurity : validates
GitFileTransport ..> DependencyReference : reads
DownloadDelegate ..> PathSecurity : re-raises traversal
DependencyReference ..> DependencyReference : checks embedded subpaths
flowchart TD
A[DownloadDelegate.download_gitlab_file] --> B{fetch_file_via_git_sparse}
B --> C[validate_path_segments before git]
C -->|clean| D[temp work tree]
D --> E[git init and remote add]
E --> F{file in subdirectory}
F -->|yes| G[sparse-checkout cone parent]
F -->|no| H[skip cone]
G --> I[git fetch blob:none depth 1]
H --> I
I --> J[git checkout FETCH_HEAD]
J --> K[ensure_path_within target]
K -->|contained| L[read bytes]
K -->|escape| X[raise PathTraversalError, no fallback]
B -->|other failure| R[GitLab REST fallback]
R --> S[return REST bytes]
sequenceDiagram
participant User as apm install
participant DD as DownloadDelegate
participant GFT as git_file_transport
participant Git as git subprocess
participant REST as GitLab REST v4
User->>DD: download_gitlab_file(dep, path, ref)
DD->>GFT: fetch_file_via_git_sparse(dep, path, ref)
GFT->>Git: sparse partial checkout
alt git succeeds
GFT-->>DD: bytes
DD-->>User: file bytes
else git fails (not traversal)
GFT-->>DD: RuntimeError
DD->>REST: raw file request
REST-->>DD: bytes
DD-->>User: file bytes
else traversal or symlink escape
GFT-->>DD: PathTraversalError
DD-->>User: propagate error
end
Recommendation
Merge now. CI is green, all critical security surfaces are test-covered, and the feature unblocks a real class of self-hosted GitLab users. Track stderr sanitization and GIT_TERMINAL_PROMPT as fast follow-up issues before the next minor release.
Full per-persona findings
Python Architect
- [recommended] fetch_file_via_git_sparse accesses self._host._build_repo_url via cross-boundary injection at
src/apm_cli/deps/download_strategies.py:659
DownloadDelegate passes self._host._build_repo_url as build_repo_url_fn. Prefer a public delegate method or strategies method to avoid coupling the new module to downloader internals.
Suggested: Expose a build_repo_url on DownloadDelegate or pass the strategies method directly. - [nit] _APM_PRIMITIVE_DIRS type annotation is redundant at
src/apm_cli/models/dependency/reference.py:201
The bare frozenset annotation could be frozenset[str] for consistency, or omitted.
Suggested: Use _APM_PRIMITIVE_DIRS: frozenset[str] = frozenset({...}).
Cli Logging Expert
- [nit] Inconsistent [i] prefix between git-success and REST-fallback verbose_callback messages at
src/apm_cli/deps/download_strategies.py:663
Git success emits no prefix while fallback and REST success prepend [i]. Existing downloader callbacks usually leave symbol rendering to the caller.
Suggested: Standardize verbose_callback payloads in a follow-up. - [nit] git_file_transport.py emits no verbose/debug output for git command plan at
src/apm_cli/deps/git_file_transport.py
The module runs several git subprocesses but gives no plan/timing breadcrumbs for APM_DEBUG users. The fallback catch gives coarse signal, so this is polish.
Suggested: Add safe _debug output summarizing sparse cone/ref/host without auth URLs.
Devx Ux Expert
- [nit] Embedded-subpath error message bakes [x] into a ValueError string at
src/apm_cli/models/dependency/reference.py:697
[x] is a rendering-layer convention. If another caller wraps the ValueError, users could see duplicate symbols. Current paths render directly, so this is not urgent. - [nit] Git success verbose callback lacks the [i] prefix used by adjacent GitLab fallback messages at
src/apm_cli/deps/download_strategies.py:663
Inconsistent prefixing in a single method can make verbose output look uneven.
Suggested: Standardize callback payloads at the caller layer or within this method.
Supply Chain Security Expert
- [recommended] git stderr in RuntimeError may echo token-bearing remote URL at
src/apm_cli/deps/git_file_transport.py:123
build_repo_url_fn can return auth-embedded URLs. The current caller swallows the error after Phase-A sanitization, but a future direct caller could surface stderr.
Suggested: Sanitize stderr before embedding, or omit stderr from the raised message. - [nit] Set GIT_TERMINAL_PROMPT=0 in subprocess env to prevent credential-helper hangs at
src/apm_cli/deps/git_file_transport.py:81
If git cannot authenticate silently, it may prompt and then time out. This is robustness, not a secret leak.
Suggested: Set git_env = {**git_env, 'GIT_TERMINAL_PROMPT': '0'} before subprocess.run.
Oss Growth Hacker
- [nit] Embedded-subpath error message could link to docs for higher conversion at
src/apm_cli/models/dependency/reference.py
The Support installing individual skills/agents from monorepo packages (including private repo support) #872 error is actionable, but a docs URL would turn confusion into docs discovery and reduce repeat issues.
Suggested: Append a short docs URL to the error string. - [nit] CHANGELOG entry leads with implementation rather than user benefit at
CHANGELOG.md
The entry is accurate but could be stronger for release notes if it opens with the self-hosted GitLab benefit.
Suggested: Consider swapping benefit and mechanism at release cut.
Auth Expert
- [recommended] git_file_transport.py does not set GIT_TERMINAL_PROMPT=0, risking a hang on interactive credential prompt at
src/apm_cli/deps/git_file_transport.py:81
Non-interactive git operations should suppress terminal prompts. If SSH auth fails and no credential helper resolves, git may wait until the 120s timeout before REST fallback.
Suggested: Set GIT_TERMINAL_PROMPT=0 in git_env. - [nit] result.stderr in RuntimeError may contain auth URL echoed by older git at
src/apm_cli/deps/git_file_transport.py:122
Modern git usually redacts, but APM supports git 2.25+ and older behavior may echo credentials. Phase-A sanitized the caller path, but direct future callers should be protected.
Suggested: Redact token-bearing URL patterns or cap/suppress stderr.
Doc Writer
- [recommended] GitLab path: transport and GITLAB_APM_PAT fallback are restated in both authentication and manage-dependencies docs at
docs/src/content/docs/consumer/manage-dependencies.md:116
Credential/fallback behavior belongs primarily in authentication.md; manage-dependencies.md can state the fetch mechanism and link out. Duplicating fallback details risks future drift.
Suggested: Trim manage-dependencies to fetch-mechanism and containment, then link to authentication.md. - [nit] Extra blank lines before the GHE Cloud heading in the packaged guide at
packages/apm-guide/.apm/skills/apm-usage/authentication.md
Cosmetic whitespace; one blank line between sections is the house style.
Suggested: Collapse to one blank line before the heading. - [nit] Phrase 'is the fix for' reads like release-note voice in user docs at
docs/src/content/docs/consumer/manage-dependencies.md:119
Docs should state the capability in present-tense reader language rather than narrating the PR.
Suggested: Use capability voice.
Test Coverage Expert
- [nit] Unit-level TestDownloadGitlabFile lacks its own PathTraversalError non-swallow assertion at
tests/unit/deps/test_download_strategies_selection.py
The behavior is covered through the GitHubPackageDownloader facade. A direct unit assertion would make the signal faster during refactors, but current integration-tier coverage is adequate.
Suggested: Add a direct DownloadDelegate test for PathTraversalError propagation and no REST fallback.
Performance Expert
- [recommended] Each path: file invokes a fresh sparse clone; same-repo batching is the next perf win at
src/apm_cli/deps/git_file_transport.py:83
Multiple path: files from the same host/repo/ref currently pay N fetches. The PR scope explicitly defers batching, which is the right boundary, but it should be next.
Suggested: Follow-up: batch path: files sharing (url, ref) into one sparse clone. - [recommended] fetch_file_via_git_sparse does not consume a locked SHA when one is already known at
src/apm_cli/deps/git_file_transport.py:108
Passing branch names works, but the install pipeline may already know the SHA. Threading that through would reduce future ref-resolution work, especially once batching exists.
Suggested: Accept optional locked_sha and fetch that when available. - [nit] TemporaryDirectory context manager would simplify cleanup at
src/apm_cli/deps/git_file_transport.py:83
Using tempfile.TemporaryDirectory is shorter and avoids ignore_errors hiding cleanup issues.
Suggested: Replace mkdtemp/finally/rmtree with tempfile.TemporaryDirectory.
This panel is advisory. It does not block merge. Re-apply the panel-review label after addressing feedback to re-run.
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 0 | 0 | Transport boundary, cache lifecycle, and concurrency model are coherent. |
| Cli Logging Expert | 0 | 0 | 0 | User-facing messages are actionable; debug-only transport detail is bounded. |
| Devx Ux Expert | 0 | 0 | 0 | Happy path is invisible; malformed git URL errors point at path:. |
| Supply Chain Security Expert | 0 | 0 | 0 | Path containment, token redaction, ref hard-fail, and REST fallback boundaries are defended. |
| Oss Growth Hacker | 0 | 0 | 0 | Release story is clear: self-hosted GitLab path installs now just work. |
| Auth Expert | 0 | 0 | 0 | Auth env reuse, prompt suppression, and PAT fallback are correct. |
| Doc Writer | 0 | 0 | 0 | Docs, packaged guide, CHANGELOG, and PR body match the implementation. |
| Test Coverage Expert | 0 | 0 | 0 | Critical paths have unit, delegate, concurrency, and real-git fixture coverage. |
| Performance Expert | 0 | 0 | 0 | blob:none, file-level sparse paths, and same-run reuse are appropriate for this scope. |
B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.
Recommendation
Ship now. All in-scope follow-ups from the prior passes have been folded, the branch is clean, local lint/tests pass, and PR checks report 14 successful / 1 skipped / 0 pending.
Full per-persona findings
No findings remain after folded follow-ups.
This panel is advisory. It does not block merge. Re-apply the panel-review label after addressing feedback to re-run.
…t API For GitLab-sourced deps (SaaS or self-hosted), path: file fetches now go through git sparse/partial checkout (--filter=blob:none + cone sparse-checkout) rather than the host REST API. This is the primary path and fixes self-hosted GitLab instances that return HTTP 410 (REST API disabled), because git transport reuses the same SSH keys and git credential helpers already in use for clones. Design (ratified by board decision): - git-transport-first: fetch_file_via_git_sparse() in the new deps/git_file_transport.py module performs git init + remote add + sparse-checkout init --cone + fetch --filter=blob:none --depth=1 + checkout - GITLAB_APM_PAT fallback: if git transport raises, DownloadDelegate.download_gitlab_file() falls back to the existing GitLab REST v4 API (mirrors ADO_APM_PAT pattern) - Path containment: ensure_path_within() is applied to the materialized file path after checkout, rejecting symlink/traversal escapes from cloned repos - validate_path_segments() rejects .. traversal sequences before any git work - Reuses the existing clone auth environment (git_env from the downloader); no second origin or cached clone TDD coverage (tests/test_gitlab_git_transport.py): - Acceptance: git-sourced dep with path: fetches file via git (no REST API call) - Self-hosted GitLab API-410 no longer fails - ensure_path_within() rejects symlink-escaping path: after checkout - validate_path_segments() rejects .. traversal before git work - GITLAB_PAT fallback path exercised when git transport raises - Root-level files skip cone setup (no sparse-checkout init) - git failure raises RuntimeError with descriptive message - File missing after checkout raises RuntimeError Docs: updated consumer/authentication.md, consumer/manage-dependencies.md, apm-usage/authentication.md, apm-usage/dependencies.md, and CHANGELOG.md. Stacked on #1732 (#1571). Closes #1014. Credit: @cossons for root-cause analysis of the self-hosted GitLab 410 repro and the ratified design (git-transport-first + thin PAT fallback). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds a parse-time guard that detects when a user embeds an APM primitive subpath inside an explicit git URL form (SCP git@host:path, ssh://, or https://), e.g.: git: git@github.com:org/repo/skills/hello-world.git Such URLs cause git to fail later with a cryptic 'not a valid repository name' error. The guard detects any known APM primitive directory name (skills, agents, prompts, instructions, chatmodes, collections, contexts, memory) appearing as a non-last interior path segment of an explicit git URL, and raises a friendly actionable error pointing at the supported `path:` key instead: [x] A subpath cannot be embedded in a git URL. Use the `path:` key instead: `git: <repo-url>` + `path: <primitive>/<name>` (or the shorthand `org/repo/<primitive>/<name>`). The monorepo-skill-install capability already ships via the `path:` key and the bare shorthand (e.g. `org/repo/skills/hello-world`). No new install feature is added. False-positive check: - GitLab subgroups (git@gitlab.com:group/subgroup/repo.git): safe, no primitive segment in interior - Azure DevOps org/project/repo forms: safe, ADO paths never use primitive names - Plain valid git URLs (2-segment path): safe, length guard returns early - Bare shorthand with subpath (org/repo/skills/foo): safe, not an explicit git URL form The guard is implemented as DependencyReference._check_no_embedded_subpath and called from DependencyReference.parse() after the shorthand-alias and fqdn-conflict guards, before virtual-package detection. Refs #872 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two panel follow-ups on the #1014 GitLab git-transport surface. Supply-chain: download_gitlab_file caught a bare `except Exception` around the git-transport attempt, so a PathTraversalError raised by the path-validation / symlink-containment guards in fetch_file_via_git_sparse was silently swallowed and the request fell through to the REST transport. A rejected traversal must not gain a second transport to probe. PathTraversalError now propagates unchanged; only genuine transport failures fall back to REST. CLI logging: the git->REST fallback transition was `_debug`-only, so triagers had no signal which transport answered a 410. The fallback now emits a verbose `[i]` note carrying the git failure reason, and the REST success notes attribute the GitLab REST API explicitly. Adds a regression test asserting a PathTraversalError in the git path is not swallowed into the REST fallback (mutation-break verified). Updates existing verbose-callback tests that exercised the unmocked-git fallback path to the new transport-attributed messages. addresses CEO follow-up (Supply Chain) and (CLI Logging) on PR #1740 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The #872 guard `_check_no_embedded_subpath` flagged any non-last path segment matching an APM primitive directory name, which produced two false positives on legitimate URLs: - A GitLab subgroup literally named after a primitive, e.g. `git@gitlab.com:group/skills/repo.git`, where `skills` is a subgroup at index 1 and `repo` is the real repository. - Azure DevOps virtual-path URLs, e.g. `https://dev.azure.com/org/proj/_git/repo/instructions/x`, where the primitive segment is the supported ADO virtual path after `_git/repo`. The embedded-subpath shape is `org/repo` + `<primitive>/<name>`, so a primitive segment now only fires when preceded by a complete org/repo prefix (index >= 2), and ADO URLs (identified by the `_git` marker) are skipped entirely. The canonical malformed form `org/repo/skills/<name>` that #872 targets still hard-fails; no GitHub or GitLab repo path uses `_git`, so real detection is unaffected. A residual deep-subgroup ambiguity (`group/sub/skills/repo`) is documented in code. Adds a regression test for the primitive-named subgroup (mutation-break verified); the ADO false positive is covered by the existing TestParseAdoUrls suite. addresses CEO follow-up (DevX) on PR #1740; Refs #872 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Avoid echoing raw git transport exceptions during the GitLab REST fallback and keep fallback tests hermetic by mocking the git transport failure path. Skip the symlink containment test when the platform cannot create symlinks. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
e87567a to
96cfb00
Compare
main tightened the file-length guardrail from 2450 to 2100 lines (Stage 1); after merging main, reference.py was 2191 lines. Move the pure, stateless identity helpers (build_dependency_unique_key, build_canonical_dependency_string, _path_segment_pattern, the semver-range guards + segment-pattern constants) into a sibling models/dependency/identity.py and re-export them from reference.py so deps/lockfile.py and install/phases/resolve.py import sites stay unchanged. Behavior-preserving; reference.py now 2099 lines (under guardrail). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 2 | 1 | Clean transport extraction and correct layering; identity helper split is sound. Architecture is shippable with minor maintainability notes. |
| CLI Logging Expert | 0 | 0 | 1 | CLI/logging hygiene is solid: token redaction is tested, prompts are suppressed, #872 error is actionable, and output remains ASCII-safe. |
| DevX UX Expert | 0 | 2 | 1 | Strong package-manager UX: the happy path matches "if git clone works, apm install works" and the #872 guard gives an actionable fix before git runs. |
| Supply Chain Security Expert | 0 | 0 | 1 | Security posture is sound: pre-git validation, post-checkout containment, no shell=True, ref option guard, credential redaction, and no fallback on path-security failures. |
| OSS Growth Hacker | 0 | 1 | 1 | Adoption-positive PR: self-hosted GitLab becomes a no-token-needed story, contributor credit is explicit, and docs/changelog are reader-facing. |
| Auth Expert | 0 | 0 | 2 | Auth design is correct: git transport reuses build_repo_url/AuthResolver credentials, suppresses prompts, redacts tokens, and keeps REST PAT fallback secondary. |
| Doc Writer | 0 | 1 | 1 | Docs accurately reflect git-transport-first GitLab path: fetch with PAT as REST fallback; minor non-bloat trim available. Ship. |
| Test Coverage Expert | 0 | 0 | 1 | Critical security/containment surfaces are well-covered with mutation-break-credible traps; only optional gap is a full CLI e2e for GitLab git-transport path deps. |
| Performance Expert | 0 | 3 | 0 | The performance shape is right: blob:none, depth=1, file-level sparse paths, and same-run checkout reuse avoid full clone and extra API round trips. |
B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.
Top 3 follow-ups
- [DevX UX Expert] Introduce a typed
GitFileTransportFileNotFoundto avoid double-error noise on fallback -- prevents an extra REST round trip and gives users a single, clear failure message when a file genuinely does not exist in the ref. - [Test Coverage Expert] Add an e2e
apm installfixture covering GitLabpath:dep materialization -- belt-and-suspenders coverage for the full install path; the transport and delegate layers are already integration-tested. - [DevX UX Expert] Surface git-to-REST fallback reason at
--verboselevel, not onlyAPM_DEBUG-- users troubleshooting slow installs with--verbosedeserve to see why REST was tried without the debug env var.
Architecture
classDiagram
direction LR
class DownloadDelegate {
<<Strategy>>
+download_gitlab_file()
-_download_gitlab_file_via_git()
-_git_file_transports dict
}
class GitSparseFileTransport {
<<ReusableResource>>
+fetch_file(file_path) bytes
+close()
-_ensure_checkout()
-_run(cmd)
}
class DependencyReference {
<<ValueObject>>
+host
+repo_url
+virtual_path
}
class IdentityHelpers {
<<PureModule>>
+build_dependency_unique_key()
+build_canonical_dependency_string()
}
class PathSecurity {
<<Guard>>
+validate_path_segments()
+ensure_path_within()
}
DownloadDelegate *-- GitSparseFileTransport : caches per repo/ref
DownloadDelegate ..> DependencyReference : reads
GitSparseFileTransport ..> PathSecurity : guards
DependencyReference ..> IdentityHelpers : delegates identity
flowchart TD
A[apm install resolves GitLab path dep] --> B[DownloadDelegate.download_gitlab_file]
B --> C[GitSparseFileTransport cached per host/repo/ref/port]
C --> D[validate_path_segments before git]
D --> E[git init + sparse-checkout --no-cone]
E --> F[git fetch --filter=blob:none --depth=1]
F --> G[git checkout FETCH_HEAD]
G --> H[ensure_path_within then read bytes]
H --> I[return file content]
C --> J[transport failure]
J --> K[secondary GitLab REST fallback]
D --> L[path/security failure: no fallback]
Recommendation
All panelists agree, no blocking-severity findings exist, security and containment are tested, and the PR directly advances APM's core positioning. Merge at maintainer discretion. The three follow-ups above are post-merge quality improvements worth tracking in issues but do not hold this ship.
Full per-persona findings
Python Architect
- [recommended] Document close() lock ordering invariant for future maintainers at
src/apm_cli/deps/git_file_transport.py:105
close()waits for active fetches via_state, then cleans up under_lock. Current ordering is safe, but a short comment would keep future changes from inverting lock order. - [recommended] Clarify intended public API of
identity.pyhelper extraction atsrc/apm_cli/models/dependency/identity.py:1
reference.pyimports underscore-prefixed helper symbols fromidentity.pyfor back-compat. This is intentional extraction, but an__all__or comment would clarify which names are meant to remain re-exported. - [nit] Finalizer cleanup path could use a one-line explanatory comment at
src/apm_cli/deps/download_strategies.py:54
weakref.finalizeis correct but subtle; a comment would help readers understand the dict capture.
CLI Logging Expert
- [nit] Verbose success wording differs from REST success wording at
src/apm_cli/deps/download_strategies.py:727
Git path success says "Fetched file via git transport" while older paths use nearby download/fetch phrasing. This is harmless but could be aligned for log grepping.
DevX UX Expert
- [recommended] Missing file after sparse checkout may fall through to REST fallback and produce double-error noise at
src/apm_cli/deps/git_file_transport.py:131
A file absent in the git ref currently raisesRuntimeError, whichdownload_gitlab_filecatches as fallback-eligible. This can cost an extra REST call and blur the primary error.
Suggested: Consider a dedicatedGitFileTransportFileNotFoundsubclass and delegate handling if this becomes noisy. - [recommended] Verbose fallback path is debug-only rather than verbose-visible at
src/apm_cli/deps/download_strategies.py:737
Fallback from git transport to REST is logged through_debug/APM_DEBUG. Users running with--verbosebut notAPM_DEBUGwill not see why REST was tried.
Suggested: Optionally callverbose_callbackbefore REST fallback. - [nit] Timeout error could include the 120s cap at
src/apm_cli/deps/git_file_transport.py:193
A timeout message that includes the cap is more actionable for CI users.
Supply Chain Security Expert
- [nit] Credential redaction focuses on HTTP(S) URL credentials at
src/apm_cli/deps/git_file_transport.py:58
The current regex matches the PR's token-bearing HTTPS stderr risk. A broader future pass could redact common token literals outside URLs, but no current leak was found.
OSS Growth Hacker
- [recommended] CHANGELOG Added line could lead more with user benefit at
CHANGELOG.md:12
The embedded-subpath guard entry is already clear; slightly more user-outcome wording would make release notes more quotable. - [nit] Docs could visually highlight the zero-config GitLab promise at
docs/src/content/docs/consumer/authentication.md:50
A callout around "If git clone works, apm install works" would help skimmers, but the text is already present and accurate.
Auth Expert
- [nit] Redaction regex could accept slash characters in credential segment at
src/apm_cli/deps/git_file_transport.py:61
GitLab tokens do not need this for the primary scope, but[^@\s]+@would be a little more defensive than[^/@\s]+@. - [nit]
GIT_ASKPASS='echo'could use a comment atsrc/apm_cli/deps/git_file_transport.py:82
This portable prompt-suppression choice is correct but non-obvious to future maintainers.
Doc Writer
- [recommended]
consumer/authentication.mdrepeats "if git clone works, apm install works" in the same section atdocs/src/content/docs/consumer/authentication.md:50
The bold lead line already states the promise; the paragraph repeats it. Trimming the repeat would keep the docs lean without losing meaning.
Suggested: Drop the trailing repeat clause and keep the bold lead. - [nit] Support installing individual skills/agents from monorepo packages (including private repo support) #872 target doc page does not name the embedded-subpath anti-pattern at
docs/src/content/docs/consumer/manage-dependencies.md:114
The parser error is self-contained and the page documents thepath:remedy, but a one-line note would improve discoverability.
Test Coverage Expert
- [nit] No full
apm installCLI e2e covers GitLabpath:dep materialization attests/integration/test_install_gitlab_git_transport.py
Transport and delegate integration coverage is strong enough for this PR; an e2e install test would be additional belt-and-suspenders coverage.
Suggested: Consider a future e2e fixture invokingapm installagainst a local git repo classified as GitLab.
Proof (missing at):tests/integration/test_install_gitlab_git_transport.py::test_apm_install_gitlab_path_dep_materializes_file-- proves: Fullapm installwith a GitLab path dependency produces the installed file on disk [vendor-neutral,devx]
assert (install_dir / 'expected_file.md').exists()
Performance Expert
- [recommended] Incremental sparse expansion re-checkouts after late-arriving paths at
src/apm_cli/deps/git_file_transport.py:172
If paths arrive after the first checkout, the transport updates sparse paths and re-checkouts. Same-run prefetch covers the common parallel case, so this is acceptable for expected small path counts. - [recommended] Transport lock serializes same-repo path fetches by design at
src/apm_cli/deps/git_file_transport.py:126
Serializing sparse-checkout mutation is necessary for correctness. This caps parallelism within one repo/ref but avoids working-tree races. - [recommended] Transport construction occurs under the delegate dictionary lock at
src/apm_cli/deps/download_strategies.py:673
The current constructor is cheap and network-free, so contention risk is low. A future per-key future pattern is only needed if construction becomes expensive.
This panel is advisory. It does not block merge. Re-apply the panel-review label after addressing feedback to re-run.
Sync the Stage 2 complexity/file-length refactor branch with main's 22 feature commits (Hermes #1726, Kiro IDE #1741, multi-host dep identity #1735, same-repo remote path deps #1732, git_file_transport #1740, revision pins #1738, marketplace sourceBase/source parity/inherit description #1736/#1739/#1755, pack --archive .zip #1720, mcp optional registry inputs #1734, and the v0.19.0/v0.20.0 releases). Conflict resolution preserved both sides: main's new features ported through the branch's extracted sibling modules, branch's tightened ruff thresholds (max-statements=120, max-branches=40, max-complexity=35, max-returns=8, max-args=12) and 800-line file limit retained. All 7 CI-mirror lint gates pass; full unit suite green (17099 passed). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
TL;DR
Stacked on #1732 (#1571) -- must merge after it. This PR fixes self-hosted GitLab instances where
path:-specified files fail with HTTP 410 (REST API disabled) by routing the fetch through git sparse/partial checkout instead of the host REST API.GITLAB_APM_PATis kept as a thin fallback for API-only environments.Credit: @cossons for the root-cause analysis and ratified git-transport-first design. @cossons -- please validate against your self-hosted GitLab 410 repro before we merge.
Closes #1014.
Problem (WHY)
Self-hosted GitLab instances often disable the REST API or do not support the same auth flow as GitHub. When
apm installencounters a dep like:APM fetches the
path:-specified file via the GitLab REST v4 API (/projects/.../repository/files/.../raw). On a self-hosted instance with the API disabled, this returns HTTP 410 and the install fails. A PAT token cannot fix a 410 -- the API is gone.Approach (WHAT)
git-transport-first (Option C-lite, per board decision): when a dep's source is a git/SSH repo classified as GitLab, extract the
path:-specified file through the SAME git fetch already in use for clones, using sparse/partial checkout:git init+git remote add origin <auth-url>git sparse-checkout init --no-cone+git sparse-checkout set --no-cone <file-path...>(exact file-level sparse paths)git fetch --filter=blob:none --depth=1 origin <ref>-- downloads trees + commits only; blobs are lazygit checkout FETCH_HEAD-- materializes only the accumulated sparse file pathsensure_path_within(target, work_dir)-- rejects symlink/traversal escapesIf git transport fails, the existing GitLab REST v4 API is tried as a thin fallback (mirroring the
ADO_APM_PATpattern). Path containment is enforced unconditionally.Implementation (HOW)
New module:
src/apm_cli/deps/git_file_transport.pyGitSparseFileTransportreusable class for one repo/ref sparse checkoutfetch_file_via_git_sparse(...)compatibility wrapper for single-file callers/tests--no-cone) for root and nested files, always paired with--filter=blob:noneGIT_TERMINAL_PROMPT=0on git subprocesses to avoid interactive auth hangsvalidate_path_segments()before git work andensure_path_within()after checkoutModified:
src/apm_cli/deps/download_strategies.pyDownloadDelegate.download_gitlab_file()tries git transport first and reuses one cached transport per GitLab host/repo/ref/port for multiplepath:filesPathTraversalErrorhard-fails with no REST fallbackDownloadDelegate.build_repo_urlseam instead of injecting the downloader's private_build_repo_urlTests:
tests/test_gitlab_git_transport.pyplus focused delegate/parser coverage--filter=blob:noneand file-level sparse paths, including root filesensure_path_withinGIT_TERMINAL_PROMPT=0; refs starting with-are rejected before gitpath:files reuse one cached sparse checkout, including a parallel resolver regression trappath:without rendering-layer status symbolsDocs: consumer/authentication.md, consumer/manage-dependencies.md, apm-usage/authentication.md, apm-usage/dependencies.md, CHANGELOG.md.
Trade-offs
path:files from the same GitLab repo/ref share one sparse checkout, avoiding N fetches while keeping cleanup bounded to the downloader lifecycle.Validation evidence
How to test
apm.ymlwith a dep like:git clone git@...succeeds).GITLAB_HOST=gitlab.corp.example.comso APM classifies the host.apm install-- the file should be fetched without any 410 error.Board scope + fences
GIT_TERMINAL_PROMPT=0set on git subprocesses)-fail before subprocess)DX rider: friendly error for subpath embedded in git URL (Refs #872)
What was added
A parse-time guard (
DependencyReference._check_no_embedded_subpath) that detects when a user accidentally embeds an APM primitive subpath inside an explicit git URL form (SCPgit@host:path,ssh://, orhttps://), e.g.:Without this guard, git would later fail with a cryptic
fatal: 'org/repo/skills/hello-world' does not appear to be a git repositorymessage. APM now fires before any git subprocess and emits:What was NOT built
The monorepo-skill-install capability already ships via the
path:key and the bare virtual-path shorthand:No new install feature was added. Issue #872 remains open as the tracker for the deferred
skills: [...]subset-filter sugar (board deferred that part).Heuristic / false-positive check
The guard checks for a known APM primitive directory name (
skills,agents,prompts,instructions,chatmodes,collections,contexts,memory) as a non-last interior path segment of an explicit git URL. Shapes that must NOT trigger -- and were verified:git@gitlab.com:group/subgroup/repo.git(GitLab subgroup)git@github.com:org/repo.git(plain 2-segment)ssh://git@host/owner/repo.git(plain ssh://)https://gitlab.com/acme/coding-standards.git(plain https://)org/repo/skills/hello-world(bare shorthand, SUPPORTED)Tests added
Focused tests in
tests/unit/test_generic_git_urls.py::TestEmbeddedSubpathInGitUrlcovering: SCP error, ssh:// error, https:// error, message mentionspath:, agents primitive, parse_from_dict path, and all the no-regression cases above.Refs #872