Skip to content

feat(remote): add local fallback when api.wiki.codes is unreachable#535

Open
idea404 wants to merge 4 commits into
justrach:release/0.2.5825from
idea404:feat/remote-local-fallback-534
Open

feat(remote): add local fallback when api.wiki.codes is unreachable#535
idea404 wants to merge 4 commits into
justrach:release/0.2.5825from
idea404:feat/remote-local-fallback-534

Conversation

@idea404

@idea404 idea404 commented Jun 5, 2026

Copy link
Copy Markdown

Closes #534

When api.wiki.codes is unreachable, automatically fall back to a local shallow clone + index for tree, outline, search, read, symbol, and deps actions.

Changes:

  • New src/remote_cache.zig module for cache management (~/.codedb/remote-cache/)
  • Modified handleRemote to try remote first, then local fallback on failure
  • Added clean_cache action to purge cached repos
  • Updated tool schema to document fallback behavior

Fallback chain:

  1. Try api.wiki.codes (existing behavior)
  2. On failure, check if action is locally serviceable
  3. Ensure repo is cached locally (git clone --depth=1 + codedb snapshot)
  4. Load cached snapshot and dispatch to local handlers

Verification:

  • zig build — compiles clean
  • zig build test — 709/710 pass (1 pre-existing flaky perf test)
  • E2E MCP test — 20/20 pass

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a8d9f0f36b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/mcp.zig Outdated
if (std.mem.eql(u8, action, "tree")) {
handleTree(alloc, out, ctx.explorer);
} else if (std.mem.eql(u8, action, "outline")) {
handleOutline(alloc, args, out, ctx.explorer);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Map remote outline query before fallback

When api.wiki.codes is unavailable, a documented remote outline call such as {"action":"outline","query":"src/foo.zig"} reaches this fallback with only query populated, but handleOutline reads the file path from path and will return error: missing 'path' argument. Normalize query into the local handler's path argument before dispatching so the advertised fallback actually works for valid remote outline requests.

Useful? React with 👍 / 👎.

Comment thread src/mcp.zig Outdated
} else if (std.mem.eql(u8, action, "search")) {
handleSearch(alloc, args, out, ctx.explorer);
} else if (std.mem.eql(u8, action, "read")) {
handleRead(io, alloc, args, out, ctx.explorer);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Map remote read line ranges before fallback

For action=read, the remote API schema and forwarding logic use a lines string like "10-60", but this fallback passes the original args directly to handleRead, which only honors line_start/line_end. When the upstream is down and a caller requests a range, the fallback ignores it and returns the entire file, which can produce much larger responses than requested; parse/translate lines before calling the local read handler.

Useful? React with 👍 / 👎.

Comment thread src/mcp.zig Outdated
} else if (std.mem.eql(u8, action, "read")) {
handleRead(io, alloc, args, out, ctx.explorer);
} else if (std.mem.eql(u8, action, "symbol")) {
handleSymbol(alloc, args, out, ctx.explorer);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Map remote symbol query before fallback

A valid remote symbol request supplies the identifier as query (and the code above validates that), but this fallback calls handleSymbol unchanged even though the local handler requires name. In the upstream-down case, {"action":"symbol","query":"Foo"} therefore falls back to error: missing 'name' argument instead of returning symbols; translate query to name for the local fallback path.

Useful? React with 👍 / 👎.

@justrach

justrach commented Jun 7, 2026

Copy link
Copy Markdown
Owner

Hi @idea404 — thank you for tackling this! 🙏 The api.wiki.codes outages (#508/#534) are a real pain, and a local clone-and-index fallback is a genuinely good answer.

1. Could you retarget the base mainrelease/0.2.5825? (Edit → base branch.) We integrate on the release branch — we've added a CI hint that'll nudge this automatically going forward.

2. Verified: it compiles and the full suite stays green, no regressions. The security-sensitive parts are well done — git clone via an argv array (no shell injection), URL hardcoded to github.com (no SSRF), and wikiSlugForRepo rejects ../// so the cache path can't traverse out (the failure-path deleteTree is safe).

A few things before we can merge:

  1. No test for the new behavior — the clone-fallback path is unexercised; our repo rule is every PR ships a test (one that forces the upstream unreachable and asserts the local fallback serves results would do it).
  2. MAX_CACHED_REPOS + the TTL helpers (cleanStaleEntries/isCacheFresh) are never calledensureCached gates only on the snapshot existing, so the cache grows unbounded and never expires.
  3. No clone timeout / size cap — a huge or hanging repo would block the handler and fill the disk when the fallback fires.
  4. Concurrent fallback for the same repo can have one clone deleteTree another's dir — a lock or clone-to-temp-then-rename fixes it.

All very doable and we're happy to help. If we don't hear back in ~1 day, we may pick these up in a follow-up building on your work, with you credited (author/co-author + release notes). Thanks again! 🙏

When the remote API fails, automatically fall back to a local shallow
clone + index for tree, outline, search, read, symbol, and deps actions.

- Add src/remote_cache.zig for cache management (~/.codedb/remote-cache/)
- Modify handleRemote to try remote first, then local fallback
- Add clean_cache action to purge cached repos
- Update tool schema to document fallback behavior

Closes justrach#534
@idea404 idea404 changed the base branch from main to release/0.2.5825 June 8, 2026 08:19
- Enforce TTL: ensureCached uses isCacheFresh, evicts stale entries
- Enforce MAX_CACHED_REPOS: evictIfNeeded runs LRU eviction before clone
- Clone timeout: git -c http.lowSpeedLimit/Time prevents hanging clones
- Size cap: --single-branch --no-tags minimizes clone size
- Race fix: clone to temp dir then atomic rename prevents concurrent clobber
- Add tests for TTL enforcement and clone-to-temp-then-rename
@idea404 idea404 force-pushed the feat/remote-local-fallback-534 branch from a8d9f0f to 774bd1e Compare June 8, 2026 08:27
@idea404

idea404 commented Jun 8, 2026

Copy link
Copy Markdown
Author

Thanks for the thorough review! All four points addressed in 774bd1e:

  1. Retargeted to release/0.2.5825.

  2. Tests addedisCacheFresh TTL enforcement and clone-to-temp-then-rename atomicity in test_mcp.zig.

  3. TTL + LRU wired upensureCached now checks freshness (not just existence), runs cleanStaleEntries before cloning, and evictIfNeeded enforces MAX_CACHED_REPOS via LRU eviction.

  4. Clone timeout + size capgit -c http.lowSpeedLimit=1000 -c http.lowSpeedTime=30 aborts stalled clones; --single-branch --no-tags keeps size down.

  5. Race fix — clone to <dir>.tmp.<timestamp>, then atomic rename to final path. Failed clones clean up the temp dir only.

Address Codex bot review feedback:
- outline: map query → path before calling handleOutline
- symbol: map query → name before calling handleSymbol
- read: parse lines '10-60' → line_start/line_end before calling handleRead

Add translateRemoteArgs() function and test coverage.
@idea404

idea404 commented Jun 8, 2026

Copy link
Copy Markdown
Author

Also addressed the Codex bot suggestions — translateRemoteArgs now maps query→path (outline), query→name (symbol), and lines "10-60"→line_start/line_end (read) before dispatching to local handlers.

Resolve test_mcp.zig conflict by keeping issue-534 remote-cache tests
alongside upstream release-branch tests (justrach#570justrach#592).

Co-authored-by: Cursor <cursoragent@cursor.com>
@idea404

idea404 commented Jun 10, 2026

Copy link
Copy Markdown
Author

@justrach quick bump

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

codedb_remote: add local fallback when api.wiki.codes is unreachable

2 participants