code-mode: decouple cell creation from observation#29290
Conversation
67fe0dd to
db85bd5
Compare
289b79d to
4439ee8
Compare
9ab809d to
d969450
Compare
| .runtime | ||
| .execute( | ||
| runtime_request(request), | ||
| .create_cell(runtime_request(request)) |
There was a problem hiding this comment.
This opens a cancellation hole between admitting the cell and registering its first observer
i.e. potential race
4439ee8 to
acb5941
Compare
d969450 to
9b1d9b8
Compare
acb5941 to
7dc6da8
Compare
9b1d9b8 to
1ad5213
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1ad521389a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| tokio::spawn(async move { | ||
| let response = started | ||
| .initial_event() | ||
| let response = pending_event | ||
| .event() | ||
| .await | ||
| .map_err(|error| error.to_string()) |
There was a problem hiding this comment.
Cancel the initial observer when the result is dropped
Because this detached task owns pending_event.event().await, dropping the returned StartedCell no longer drops the actor's initial observation. In flows where a caller clones the public cell_id and then cancels or discards initial_response, the cell still has an active YieldAfter observer until the first yield/completion, so subsequent wait/pending observations for that cell are rejected as busy instead of taking over the cell as before. Tie the actor future to the returned receiver, or abort/select it when that receiver is dropped, so cancellation releases the observer.
Useful? React with 👍 / 👎.
| .await | ||
| .map_err(|error| error.to_string())?; | ||
| let pending_event = self | ||
| .runtime | ||
| .begin_observe( |
There was a problem hiding this comment.
Guard admitted cells when execute is cancelled
If the execute future is cancelled after create_cell(...).await has admitted the runtime but before or while begin_observe completes, the caller never receives the cell_id and no initial observer or termination path is installed. In a cancellation/timeout path with a long-running cell, that leaves an unobservable cell running until session shutdown; keep admission plus the first observe atomic for execute, or add a cancellation guard that terminates the admitted cell before returning/dropping.
Useful? React with 👍 / 👎.
| runtime_request(request), | ||
| runtime::ObserveMode::PendingFrontier, | ||
| ) | ||
| .create_cell(runtime_request(request)) |
There was a problem hiding this comment.
create_cell() uses the continuing policy now, so a fast tool response can advance past the first pending frontier before this separate observe() is registered.
Can we create this cell with PauseAtPendingFrontier and cover that interleaving?
| if matches!(mode, ObserveMode::PendingFrontier) && pending_frontier_ready { | ||
| pending_frontier_ready = false; | ||
| match send_cell_event( | ||
| has_been_observed = true; |
There was a problem hiding this comment.
This marks the cell observed before anything is delivered. Isn't this strange?
Feel free to ignore if not
| let runtime_cell_id = self | ||
| .runtime | ||
| .execute( | ||
| .create_cell_with_execution_policy( |
There was a problem hiding this comment.
This makes normal execute cells pause after their first yield... once the initial observer is gone, the next pending frontier stops the runtime, so async work cannot advance until wait attaches
| runtime_request(request), | ||
| runtime::ObserveMode::PendingFrontier, | ||
| ) | ||
| .create_cell(runtime_request(request)) |
There was a problem hiding this comment.
I think this has the inverse race: create_cell is continuing, so the runtime can run past its first pending frontier before observe(PendingFrontier) is dequeued
Why
Cell admission, observation, and runtime scheduling are separate concerns. A cell may complete or reach a pending frontier before an observer attaches, and continuing execution should not pause merely because an observation future is absent.
What
SessionRuntimecreate a cell without attaching an observer.ContinueWhenUnblockedas the default and a separate pause-at-pending-frontier policy.Stack boundary
This PR changes the transport-neutral runtime beneath the session protocol. The host/core
CodeModeSessionsurface still exposes the existingexecute/waitAPI here; explicit protocol-levelcreate_cell/observearrives in #29291. Generation-checked pending resume arrives in #29399.Validation
just test -p codex-code-mode