14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -34,6 +34,20 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
already attached to the old daemon keep using it while new sessions run
standalone until it idles out — they never mix versions over the socket.

### Fixed
- **The file watcher no longer exhausts the OS file-watch budget on large
repos (#276).** It used to register a recursive watch over the *entire*
project — `node_modules/`, build output, caches and all — and filter only
after the fact. On Linux that meant hundreds of thousands of inotify watches
per project; enough that a second project, or codegraph alongside your editor
/ `next dev`, could hit the per-user ceiling and fail with "OS file watch
limit reached." The watcher now excludes the same directories the indexer
ignores (the built-in default-ignore set **plus** your `.gitignore`) *before*
registering a watch — so on a repo with a 900-directory `node_modules` the
watch count drops from ~1,200 to ~14, even when the project has no
`.gitignore`. (Stacks with the shared daemon from #411: one watcher across
agents, and now that watcher is small.)
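
Review note: the entry above turns on excluding directories *before* a watch is registered rather than filtering events afterwards. The watcher source is not part of the rendered diff, so the following is only a minimal sketch of that idea using Node's standard library. `collectWatchDirs` and `IGNORED_DIR_NAMES` are hypothetical names, and the hard-coded set merely stands in for the indexer's default-ignore set plus `.gitignore`.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical stand-in for the indexer's ignore rules (assumption: the real
// rules come from buildDefaultIgnore and are pattern-based, not a name set).
const IGNORED_DIR_NAMES = new Set(['node_modules', '.git', 'dist', 'build']);

/**
 * Collect the directories a watcher would register. Ignored directories are
 * pruned BEFORE descending, so nothing beneath them ever costs a watch.
 */
function collectWatchDirs(root: string, ignored: Set<string> = IGNORED_DIR_NAMES): string[] {
  const dirs: string[] = [];
  const stack: string[] = [root];
  while (stack.length > 0) {
    const dir = stack.pop() as string;
    dirs.push(dir);
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      if (!entry.isDirectory()) continue;
      if (ignored.has(entry.name)) continue; // pruned: subtree is never visited
      stack.push(path.join(dir, entry.name));
    }
  }
  return dirs;
}

// Usage: a tiny project with a nested dependency tree and no .gitignore.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'watch-demo-'));
fs.mkdirSync(path.join(demoRoot, 'src', 'util'), { recursive: true });
fs.mkdirSync(path.join(demoRoot, 'node_modules', 'dep', 'lib'), { recursive: true });

const pruned = collectWatchDirs(demoRoot);
const unpruned = collectWatchDirs(demoRoot, new Set());
console.log(`watched with pruning: ${pruned.length}, without: ${unpruned.length}`);
// prints "watched with pruning: 3, without: 6"
```

Pruning at traversal time is what makes the watch count scale with the project's own directories instead of with its dependencies; a filter applied to events after registration would leave the count unchanged.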

## [0.9.5] - 2026-05-25

### Fixed
30 changes: 30 additions & 0 deletions __tests__/watcher.test.ts
@@ -166,6 +166,36 @@ describe('FileWatcher', () => {

watcher.stop();
});

+    it('should not watch node_modules even without a .gitignore (#276/#417)', async () => {
+      // No .gitignore in testDir — exclusion relies on the built-in
+      // default-ignore set the indexer uses (buildDefaultIgnore), which a
+      // .gitignore-only filter would miss.
+      fs.mkdirSync(path.join(testDir, 'node_modules', 'dep', 'lib'), { recursive: true });
+      fs.writeFileSync(path.join(testDir, 'node_modules', 'dep', 'index.ts'), 'export const dep = 1;');
+
+      const syncFn = vi.fn().mockResolvedValue({ filesChanged: 0, durationMs: 0 });
+      const watcher = new FileWatcher(testDir, syncFn, { debounceMs: 200 });
+      watcher.start();
+
+      // Let the watcher settle past any residual crawl events.
+      await new Promise((r) => setTimeout(r, 400));
+      syncFn.mockClear();
+
+      // A source-extension edit INSIDE node_modules must NOT trigger a sync —
+      // the directory was never watched.
+      fs.writeFileSync(path.join(testDir, 'node_modules', 'dep', 'lib', 'extra.ts'), 'export const e = 2;');
+      await new Promise((r) => setTimeout(r, 600));
+      expect(syncFn).not.toHaveBeenCalled();
+
+      // Positive control: a real source edit still triggers sync, proving the
+      // watcher is live (not merely inert).
+      fs.writeFileSync(path.join(testDir, 'src', 'live.ts'), 'export const live = 3;');
+      await waitFor(() => syncFn.mock.calls.length > 0, 5000);
+      expect(syncFn).toHaveBeenCalled();
+
+      watcher.stop();
+    });
});

describe('callbacks', () => {
33 changes: 31 additions & 2 deletions package-lock.json

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions package.json
@@ -33,6 +33,7 @@
"license": "MIT",
"dependencies": {
"@clack/prompts": "^1.3.0",
+    "chokidar": "^4.0.3",
"commander": "^14.0.2",
"fast-string-width": "^3.0.2",
"fast-wrap-ansi": "^0.2.0",
2 changes: 1 addition & 1 deletion src/extraction/index.ts
@@ -166,7 +166,7 @@ const DEFAULT_IGNORE_PATTERNS: string[] = [
* the defaults apply to tracked files too (committing a dependency dir doesn't make
* it project code; the explicit `.gitignore` negation is the only opt-in).
*/
-function buildDefaultIgnore(rootDir: string): Ignore {
+export function buildDefaultIgnore(rootDir: string): Ignore {
const ig = ignore().add(DEFAULT_IGNORE_PATTERNS);
try {
const rootGitignore = path.join(rootDir, '.gitignore');
74 changes: 40 additions & 34 deletions src/mcp/daemon.ts
@@ -280,55 +280,61 @@ export type AcquireResult =
| { kind: 'taken'; existing: DaemonLockInfo | null; pidPath: string };

/**
- * Atomically create the daemon pidfile AND write its full record in the same
- * call. Returns either an `acquired` result (the caller is now the daemon-elect
- * and may construct a {@link Daemon}) or a `taken` result.
+ * Atomically create the daemon pidfile with its full record already in place.
+ * Returns either an `acquired` result (the caller is the daemon-elect and may
+ * construct a {@link Daemon}) or a `taken` result.
  *
- * must-fix 1 (issue #411 review): the original implementation created the
- * pidfile empty under an `O_EXCL` fd and only wrote the body later, after
- * `server.listen` resolved. A second candidate that read the pidfile during
- * that millisecond-wide window saw an empty file, decoded it as `null`, treated
- * it as stale, and `unlink`'d the lock the first daemon still held — producing
- * two daemons (two watchers, two writers) on concurrent startup, exactly the
- * multi-agent scenario the feature targets. Writing the complete record before
- * returning the handle closes that window: a concurrent reader always sees a
- * valid pid+version+socketPath, never an empty file. The socket path is
- * deterministic from the project root, so it's known here.
+ * must-fix 1 (issue #411 review): the lockfile must appear in ONE atomic step,
+ * already complete — never empty, even momentarily. The first attempt at this
+ * (`O_EXCL` create then a separate `writeSync`) left a microsecond window where
+ * the file existed but was empty; under concurrent daemon startup a third
+ * candidate could read that empty file, decode it as `null`, and `unlink` the
+ * winner's lock → two daemons (two watchers, two writers). The window was
+ * normally too small to hit, but the chokidar watcher's extra startup time made
+ * concurrent daemons overlap enough to reproduce it reliably.
+ *
+ * The fix writes the complete record to a private temp file, then hard-links it
+ * into place: `link()` is atomic AND exclusive (EEXIST if the target exists), so
+ * the pidfile becomes visible in one step already containing a full record.
+ * Whoever links first wins; everyone else gets EEXIST and reads a complete file.
+ * There is no empty-file window at all.
  */
 export function tryAcquireDaemonLock(projectRoot: string): AcquireResult {
   const pidPath = getDaemonPidPath(projectRoot);
   // Make sure the .codegraph/ directory exists — the daemon may be the first
   // thing to touch it on a fresh-clone-but-already-initialized checkout.
   fs.mkdirSync(path.dirname(pidPath), { recursive: true });

+  const info: DaemonLockInfo = {
+    pid: process.pid,
+    version: CodeGraphPackageVersion,
+    socketPath: getDaemonSocketPath(projectRoot),
+    startedAt: Date.now(),
+  };
+
+  // Temp name is pid-scoped so racing candidates never collide on it.
+  const tmp = `${pidPath}.${process.pid}.tmp`;
+  let acquired = false;
   try {
-    // `wx` = O_CREAT | O_EXCL | O_WRONLY: atomic "create only if absent".
-    const fd = fs.openSync(pidPath, 'wx', 0o600);
-    const info: DaemonLockInfo = {
-      pid: process.pid,
-      version: CodeGraphPackageVersion,
-      socketPath: getDaemonSocketPath(projectRoot),
-      startedAt: Date.now(),
-    };
+    fs.writeFileSync(tmp, encodeLockInfo(info), { mode: 0o600 });
     try {
-      // Synchronous write immediately after the create — no await in between —
-      // so the empty-file window is a single fs.writeSync, not an I/O-bound
-      // `server.listen`. Combined with the pid-verified `clearStaleDaemonLock`
-      // below, concurrent candidates can never delete a live daemon's lock.
-      fs.writeSync(fd, encodeLockInfo(info));
-    } finally {
-      fs.closeSync(fd);
+      fs.linkSync(tmp, pidPath); // atomic + exclusive
+      acquired = true;
+    } catch (err: unknown) {
+      if ((err as NodeJS.ErrnoException).code !== 'EEXIST') throw err;
     }
-    return { kind: 'acquired', pidPath, info };
-  } catch (err: unknown) {
-    const e = err as NodeJS.ErrnoException;
-    if (e.code !== 'EEXIST') throw err;
+  } finally {
+    try { fs.unlinkSync(tmp); } catch { /* temp already gone */ }
   }

+  if (acquired) return { kind: 'acquired', pidPath, info };
+
+  // Taken. Because the pidfile was link'd atomically it always holds a complete
+  // record — `existing` is null only for a genuinely corrupt leftover, never a
+  // mid-write race.
   let existing: DaemonLockInfo | null = null;
   try {
-    const raw = fs.readFileSync(pidPath, 'utf8');
-    existing = decodeLockInfo(raw);
+    existing = decodeLockInfo(fs.readFileSync(pidPath, 'utf8'));
   } catch { /* unreadable lockfile — treat as malformed */ }
   return { kind: 'taken', existing, pidPath };
 }
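
Review note: the write-then-link pattern described in the doc comment can be exercised in isolation. The sketch below is not the PR's code. `publishLock`, `LockRecord`, and the `tag` parameter are hypothetical names introduced for illustration, JSON stands in for `encodeLockInfo`, and it assumes a filesystem that supports hard links.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical record shape; the real DaemonLockInfo also carries version
// and socketPath.
interface LockRecord {
  pid: number;
  startedAt: number;
}

/**
 * Publish `record` at `lockPath` in one atomic, exclusive step.
 * Returns true if this caller created the lock, false if one already existed.
 * `tag` keeps temp names distinct per candidate (the PR uses process.pid).
 */
function publishLock(lockPath: string, record: LockRecord, tag: string = String(process.pid)): boolean {
  const tmp = `${lockPath}.${tag}.tmp`;
  // The full record is written to a private file nobody else reads.
  fs.writeFileSync(tmp, JSON.stringify(record), { mode: 0o600 });
  try {
    fs.linkSync(tmp, lockPath); // throws EEXIST if lockPath already exists
    return true;
  } catch (err: unknown) {
    if ((err as NodeJS.ErrnoException).code !== 'EEXIST') throw err;
    return false;
  } finally {
    // The temp name is only scaffolding; remove it whether we won or lost.
    try { fs.unlinkSync(tmp); } catch { /* temp already gone */ }
  }
}

// Usage: two candidates race for the same lock path; tags simulate two pids.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'lock-demo-'));
const lockPath = path.join(dir, 'daemon.pid');
const first = publishLock(lockPath, { pid: 1001, startedAt: Date.now() }, 'a');
const second = publishLock(lockPath, { pid: 1002, startedAt: Date.now() }, 'b');
const winner = JSON.parse(fs.readFileSync(lockPath, 'utf8')) as LockRecord;
console.log(first, second, winner.pid); // prints "true false 1001"
```

Because the lock name only ever appears through `link`, a reader that finds the file also finds the complete record, which is why the loser can parse it without retrying.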