fix: native host cleanup#204
Conversation
🦋 Changeset detectedLatest commit: 68b26e9 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
WalkthroughIntroduces owner-token-based session tracking for the OpenCode native messaging host with disk persistence, adds health probes to detect unowned OpenCode servers, and refactors cleanup flows to only terminate managed or clearly OpenCode-like processes. ChangesNative host ownership tracking and safer process cleanup
Sequence DiagramsequenceDiagram
participant NativeHost
participant StateDir as State Directory
participant SessionMeta as Session Metadata
participant HealthProbe as /global/health
NativeHost->>StateDir: getOrCreateNativeHostOwnerToken
StateDir-->>NativeHost: ownerToken (long-lived)
NativeHost->>StateDir: restorePersisstedManagedPids
StateDir-->>NativeHost: managed PIDs for this token
Note over NativeHost: On spawn
NativeHost->>SessionMeta: persist {port, pid, ownerToken, updatedAt}
Note over NativeHost: On cleanup check
NativeHost->>HealthProbe: probe port for server
HealthProbe-->>NativeHost: server detected or not
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
packages/cli/src/commands/opencode/implementation.tsOops! Something went wrong! :( ESLint: 9.39.4 TypeError: Key "languageOptions": Key "parser": Expected object with parse() or parseForESLint() method. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/cli/src/commands/opencode/implementation.ts (1)
1394-1423:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSerialize native-host commands instead of fire-and-forget async handlers.
handleMessagenow awaits long-runningstartServer/restartServer, but the parser still dispatches each message withvoid handleMessage(msg). If Chrome sendsstart,restart, orstopback-to-back, those handlers can race onserverProc,managedSessions, andmanagedPids, leading to out-of-order status messages and killing the wrong process.Suggested queuing approach
+ let messageQueue = Promise.resolve(); + process.stdin.on('data', (chunk) => { logger.log('Received data chunk', { length: chunk.length }); ... try { const msg = JSON.parse(messageData.toString()); - void handleMessage(msg); + messageQueue = messageQueue + .then(() => handleMessage(msg)) + .catch((err) => { + logger.error('Queued message handling failed', err); + sendMessage({ status: 'error', message: 'Internal error processing message' }); + }); } catch (err) { logger.error('Failed to parse JSON message', err); }Also applies to: 1514-1554
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/cli/src/commands/opencode/implementation.ts` around lines 1394 - 1423, handleMessage is being invoked fire-and-forget (void handleMessage(msg)) while it now awaits long-running startServer/restartServer, causing races on shared state (serverProc, managedSessions, managedPids) when Chrome sends commands back-to-back; fix by serializing native-host command handling: replace the fire-and-forget dispatch with a single-message queue or a sequential processor that ensures only one handler runs at a time (e.g., a simple FIFO Promise queue or an async loop) and have the parser enqueue messages instead of calling handleMessage directly; ensure startServer, restartServer and stop handlers are consumed from the queue in order and that errors are caught and reported back before dequeuing the next message so status messages remain ordered.
🧹 Nitpick comments (1)
.changeset/few-results-kneel.md (1)
5-5: ⚡ Quick winConsider a more descriptive changelog entry.
The entry "fix: adjusted the native host process cleanup" is brief. Users reading the changelog would benefit from understanding what was adjusted and why. Consider adding details like "prevents terminating unmanaged processes" or "adds ownership tracking to avoid killing unrelated OpenCode servers."
📝 Example of a more descriptive entry
-fix: adjusted the native host process cleanup +fix: improved native host process cleanup to only terminate managed or OpenCode-owned processesor
-fix: adjusted the native host process cleanup +fix: added ownership tracking to prevent native host from terminating unrelated OpenCode servers during cleanup🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.changeset/few-results-kneel.md at line 5, Replace the terse changelog header "fix: adjusted the native host process cleanup" with a more descriptive entry explaining what was changed and why; update the .changeset entry that contains that exact header to something like "fix: avoid terminating unmanaged native host processes by adding ownership tracking to native host cleanup" or "fix: add ownership checks to native host process cleanup to prevent killing unrelated OpenCode servers," and include one sentence describing the user-facing impact and the root cause addressed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/cli/src/commands/opencode/implementation.ts`:
- Around line 742-757: The owner token is being persisted in cleartext at
NATIVE_HOST_OWNER_TOKEN_PATH and (per review) additionally logged; update
getOrCreateNativeHostOwnerToken to store/retrieve the token from the CLI's
secure token storage location (use the project's secure token API/utility or the
designated secure token path instead of
NATIVE_HOST_STATE_DIR/NATIVE_HOST_OWNER_TOKEN_PATH), remove any debugging or
logging that prints the raw token value (log only presence/creation events or a
redacted token), and apply the same changes to the duplicated logic referenced
around lines 1157-1160 so both code paths use secure storage and never emit the
token in logs.
- Around line 495-507: The persisted PID ownership from loadPersistedManagedPids
can outlive the original process and cause killProcessOnPort to kill unrelated
recycled PIDs; update the persistence/validation logic so ownedPersistedPids are
only trusted after validating a stable process identity or TTL. Change the
session-*.json format (used by loadPersistedManagedPids and cleanup) to store a
process startTime or unique process token (e.g., PID+startTime or a processUUID)
and when killProcessOnPort (and other callers) loads ownedPersistedPids, verify
each entry by checking the OS process start time or matching the stored process
token and/or rejecting entries older than a configurable TTL, and ensure cleanup
prunes stale session files when validation fails; adjust allowedOwnerToken
handling so only validated, live owners are considered.
---
Outside diff comments:
In `@packages/cli/src/commands/opencode/implementation.ts`:
- Around line 1394-1423: handleMessage is being invoked fire-and-forget (void
handleMessage(msg)) while it now awaits long-running startServer/restartServer,
causing races on shared state (serverProc, managedSessions, managedPids) when
Chrome sends commands back-to-back; fix by serializing native-host command
handling: replace the fire-and-forget dispatch with a single-message queue or a
sequential processor that ensures only one handler runs at a time (e.g., a
simple FIFO Promise queue or an async loop) and have the parser enqueue messages
instead of calling handleMessage directly; ensure startServer, restartServer and
stop handlers are consumed from the queue in order and that errors are caught
and reported back before dequeuing the next message so status messages remain
ordered.
---
Nitpick comments:
In @.changeset/few-results-kneel.md:
- Line 5: Replace the terse changelog header "fix: adjusted the native host
process cleanup" with a more descriptive entry explaining what was changed and
why; update the .changeset entry that contains that exact header to something
like "fix: avoid terminating unmanaged native host processes by adding ownership
tracking to native host cleanup" or "fix: add ownership checks to native host
process cleanup to prevent killing unrelated OpenCode servers," and include one
sentence describing the user-facing impact and the root cause addressed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 80ef3537-a0a5-433c-ae9c-0ca90b77feaf
📒 Files selected for processing (2)
.changeset/few-results-kneel.mdpackages/cli/src/commands/opencode/implementation.ts
| function killProcessOnPort( | ||
| port: number, | ||
| allowedPids: Set<number>, | ||
| logger?: { log: (msg: string, data?: any) => void; error: (msg: string, err?: any) => void }, | ||
| options?: { allowUnmanagedOpencode?: boolean; allowedOwnerToken?: string }, | ||
| ): boolean { | ||
| const logInfo = logger?.log ?? ((msg: string) => { /* silent */ }); | ||
| const logError = logger?.error ?? ((msg: string) => { /* silent */ }); | ||
| const allowUnmanagedOpencode = options?.allowUnmanagedOpencode !== false; | ||
| const allowedOwnerToken = options?.allowedOwnerToken?.trim().toLowerCase(); | ||
| const ownedPersistedPids = allowedOwnerToken | ||
| ? loadPersistedManagedPids(allowedOwnerToken) | ||
| : new Set<number>(); |
There was a problem hiding this comment.
Restored PID ownership can outlive the original process.
loadPersistedManagedPids trusts every session-*.json entry for the owner token, but cleanup only removes metadata for sessions created by the current process. After a crash or reconnect, stale files survive indefinitely; once the OS recycles one of those PIDs, killProcessOnPort will treat an unrelated listener as owned and force-kill it. Bind persisted ownership to a stable process identity or TTL before reusing it.
Also applies to: 781-812, 1154-1160, 1475-1487
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/cli/src/commands/opencode/implementation.ts` around lines 495 - 507,
The persisted PID ownership from loadPersistedManagedPids can outlive the
original process and cause killProcessOnPort to kill unrelated recycled PIDs;
update the persistence/validation logic so ownedPersistedPids are only trusted
after validating a stable process identity or TTL. Change the session-*.json
format (used by loadPersistedManagedPids and cleanup) to store a process
startTime or unique process token (e.g., PID+startTime or a processUUID) and
when killProcessOnPort (and other callers) loads ownedPersistedPids, verify each
entry by checking the OS process start time or matching the stored process token
and/or rejecting entries older than a configurable TTL, and ensure cleanup
prunes stale session files when validation fails; adjust allowedOwnerToken
handling so only validated, live owners are considered.
| function getOrCreateNativeHostOwnerToken(): string { | ||
| ensureDirectoryExists(NATIVE_HOST_STATE_DIR); | ||
| try { | ||
| if (fs.existsSync(NATIVE_HOST_OWNER_TOKEN_PATH)) { | ||
| const existing = fs.readFileSync(NATIVE_HOST_OWNER_TOKEN_PATH, 'utf8').trim(); | ||
| if (existing) { | ||
| return existing; | ||
| } | ||
| } | ||
| } catch { | ||
| // fall through and recreate | ||
| } | ||
|
|
||
| const token = randomUUID(); | ||
| fs.writeFileSync(NATIVE_HOST_OWNER_TOKEN_PATH, token, 'utf8'); | ||
| return token; |
There was a problem hiding this comment.
Owner token is persisted and logged in cleartext.
This token is now the capability that decides which servers are treated as "owned", but the change writes it to owner-token.txt and echoes the raw value into the debug log. Please move it to the CLI's secure token storage path and stop logging the token itself.
As per coding guidelines "packages/{cli,core}/src/**/*.{ts,tsx}: Implement secure storage for token management".
Also applies to: 1157-1160
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/cli/src/commands/opencode/implementation.ts` around lines 742 - 757,
The owner token is being persisted in cleartext at NATIVE_HOST_OWNER_TOKEN_PATH
and (per review) additionally logged; update getOrCreateNativeHostOwnerToken to
store/retrieve the token from the CLI's secure token storage location (use the
project's secure token API/utility or the designated secure token path instead
of NATIVE_HOST_STATE_DIR/NATIVE_HOST_OWNER_TOKEN_PATH), remove any debugging or
logging that prints the raw token value (log only presence/creation events or a
redacted token), and apply the same changes to the duplicated logic referenced
around lines 1157-1160 so both code paths use secure storage and never emit the
token in logs.
Summary by CodeRabbit