
code-mode: decouple cell creation from observation #29290

Closed
cconger wants to merge 2 commits into main from cconger/code-mode-runtime-compact-03f-create-observe-runtime


@cconger cconger commented Jun 21, 2026

Contributor

Why

Cell admission, observation, and runtime scheduling are separate concerns. A cell may complete or reach a pending frontier before an observer attaches, and continuing execution should not pause merely because an observation future is absent.

What

  • Let SessionRuntime create a cell without attaching an observer.
  • Buffer completion, pending frontiers, and the first pre-observation yield until an observation arrives.
  • Preserve the initial yield split when delivery races with cancellation.
  • Make execution policy independent of observation mode, with ContinueWhenUnblocked as the default and a separate pause-at-pending-frontier policy.
  • Keep termination atomic with respect to session-store commits.
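The buffering item above can be illustrated with a minimal single-threaded sketch. `CellEvent`, `CellState`, `emit`, and `attach_observer` are hypothetical names chosen for illustration and are not taken from the PR. For simplicity the sketch buffers every pre-observation event, whereas the description only mentions completion, pending frontiers, and the first pre-observation yield.

```rust
use std::collections::VecDeque;

/// Hypothetical event kinds a cell can produce before anyone observes it.
#[derive(Debug, Clone, PartialEq)]
enum CellEvent {
    Yield(String),
    PendingFrontier,
    Completed(String),
}

/// Hypothetical per-cell state: events wait in `buffered` until an observer attaches.
#[derive(Default)]
struct CellState {
    buffered: VecDeque<CellEvent>,
    delivered: Vec<CellEvent>,
    observer_attached: bool,
}

impl CellState {
    /// Record an event; deliver it immediately only if an observer is attached.
    fn emit(&mut self, event: CellEvent) {
        if self.observer_attached {
            self.delivered.push(event);
        } else {
            self.buffered.push_back(event);
        }
    }

    /// Attach an observer and flush everything buffered so far, in order.
    fn attach_observer(&mut self) {
        self.observer_attached = true;
        while let Some(event) = self.buffered.pop_front() {
            self.delivered.push(event);
        }
    }
}

/// The cell yields and reaches a pending frontier before any observer exists.
fn demo_late_observer() -> Vec<CellEvent> {
    let mut cell = CellState::default();
    cell.emit(CellEvent::Yield("first".to_string()));
    cell.emit(CellEvent::PendingFrontier);
    assert!(cell.delivered.is_empty()); // nothing delivered yet, nothing lost
    cell.attach_observer();
    cell.emit(CellEvent::Completed("done".to_string())); // delivered directly
    cell.delivered
}
```

Calling `demo_late_observer()` returns the three events in emission order, which is the property the delayed-first-observation regressions are described as covering.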

Stack boundary

This PR changes the transport-neutral runtime beneath the session protocol. The host/core CodeModeSession surface still exposes the existing execute/wait API here; explicit protocol-level create_cell/observe arrives in #29291. Generation-checked pending resume arrives in #29399.

Validation

  • just test -p codex-code-mode
  • Regressions cover delayed first observation, completion before observation, canceled delivery, store-commit cancellation, and continuing versus pausable execution.

@cconger cconger force-pushed the cconger/code-mode-runtime-compact-03f-create-observe-runtime branch from 67fe0dd to db85bd5 Compare June 21, 2026 05:33
@cconger cconger force-pushed the cconger/code-mode-runtime-compact-03e3-initial-yield-boundary branch from 289b79d to 4439ee8 Compare June 21, 2026 06:24
@cconger cconger force-pushed the cconger/code-mode-runtime-compact-03f-create-observe-runtime branch 2 times, most recently from 9ab809d to d969450 Compare June 21, 2026 07:23
Comment thread codex-rs/code-mode/src/service.rs Outdated
  .runtime
- .execute(
-     runtime_request(request),
+ .create_cell(runtime_request(request))

Collaborator

This opens a cancellation hole between admitting the cell and registering its first observer, i.e. a potential race.

@cconger cconger force-pushed the cconger/code-mode-runtime-compact-03e3-initial-yield-boundary branch from 4439ee8 to acb5941 Compare June 21, 2026 19:24
@cconger cconger force-pushed the cconger/code-mode-runtime-compact-03f-create-observe-runtime branch from d969450 to 9b1d9b8 Compare June 21, 2026 19:24
@cconger cconger force-pushed the cconger/code-mode-runtime-compact-03e3-initial-yield-boundary branch from acb5941 to 7dc6da8 Compare June 21, 2026 20:55
Base automatically changed from cconger/code-mode-runtime-compact-03e3-initial-yield-boundary to main June 21, 2026 22:35
@cconger cconger force-pushed the cconger/code-mode-runtime-compact-03f-create-observe-runtime branch from 9b1d9b8 to 1ad5213 Compare June 21, 2026 22:44
@cconger cconger marked this pull request as ready for review June 21, 2026 22:55
@cconger cconger closed this Jun 21, 2026
@cconger cconger reopened this Jun 21, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Contributor

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1ad521389a


Comment on lines 108 to 112
  tokio::spawn(async move {
-     let response = started
-         .initial_event()
+     let response = pending_event
+         .event()
          .await
          .map_err(|error| error.to_string())

Contributor

[P2] Cancel the initial observer when the result is dropped

Because this detached task owns pending_event.event().await, dropping the returned StartedCell no longer drops the actor's initial observation. In flows where a caller clones the public cell_id and then cancels or discards initial_response, the cell still has an active YieldAfter observer until the first yield/completion, so subsequent wait/pending observations for that cell are rejected as busy instead of taking over the cell as before. Tie the actor future to the returned receiver, or abort/select it when that receiver is dropped, so cancellation releases the observer.
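One way to picture the suggested fix is a handle whose drop releases the observer slot. The sketch below is a synchronous analogue under stated assumptions: `ObserverSlot`, `Observation`, and `try_observe` are hypothetical names, and the real change would more likely tie the spawned task to the returned receiver in async code.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical single-observer slot for one cell.
#[derive(Default)]
struct ObserverSlot {
    busy: Arc<AtomicBool>,
}

/// Hypothetical caller-side handle; dropping it frees the slot.
struct Observation {
    busy: Arc<AtomicBool>,
}

impl ObserverSlot {
    /// Claim the slot, or return `None` if another observer is active ("busy").
    fn try_observe(&self) -> Option<Observation> {
        self.busy
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .ok()
            .map(|_| Observation { busy: Arc::clone(&self.busy) })
    }
}

impl Drop for Observation {
    fn drop(&mut self) {
        // Cancelling or discarding the handle releases the observer.
        self.busy.store(false, Ordering::Release);
    }
}

/// Returns (rejected while the initial observer is held, accepted after it is dropped).
fn demo_drop_releases_observer() -> (bool, bool) {
    let slot = ObserverSlot::default();
    let initial = slot.try_observe().expect("slot starts free");
    let rejected_while_held = slot.try_observe().is_none();
    drop(initial); // the caller discards the initial response
    let accepted_after_drop = slot.try_observe().is_some();
    (rejected_while_held, accepted_after_drop)
}
```

With this shape, a later wait or pending observation can take over the cell as soon as the first handle is gone, rather than being rejected as busy until the first yield or completion.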


Comment on lines +95 to +99
.await
.map_err(|error| error.to_string())?;
let pending_event = self
.runtime
.begin_observe(

Contributor

[P2] Guard admitted cells when execute is cancelled

If the execute future is cancelled after create_cell(...).await has admitted the runtime but before or while begin_observe completes, the caller never receives the cell_id and no initial observer or termination path is installed. In a cancellation/timeout path with a long-running cell, that leaves an unobservable cell running until session shutdown; keep admission plus the first observe atomic for execute, or add a cancellation guard that terminates the admitted cell before returning/dropping.
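The cancellation guard mentioned here could follow the usual drop-guard pattern. The following is a sketch only: `FakeRuntime`, `AdmittedCellGuard`, `disarm`, and this simplified `execute` are hypothetical stand-ins, and early return is used to model the future being dropped between admission and the first observe.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for the runtime: records which cells were terminated.
#[derive(Default)]
struct FakeRuntime {
    terminated: RefCell<Vec<u64>>,
}

/// Guard that terminates an admitted cell unless ownership was handed off.
struct AdmittedCellGuard {
    runtime: Rc<FakeRuntime>,
    cell_id: u64,
    armed: bool,
}

impl AdmittedCellGuard {
    fn new(runtime: Rc<FakeRuntime>, cell_id: u64) -> Self {
        Self { runtime, cell_id, armed: true }
    }

    /// Call once the first observer is installed and the caller holds the cell id.
    fn disarm(mut self) -> u64 {
        self.armed = false;
        self.cell_id
    }
}

impl Drop for AdmittedCellGuard {
    fn drop(&mut self) {
        if self.armed {
            self.runtime.terminated.borrow_mut().push(self.cell_id);
        }
    }
}

/// Simplified `execute`; `cancel_before_observe` models the future being dropped early.
fn execute(runtime: &Rc<FakeRuntime>, cell_id: u64, cancel_before_observe: bool) -> Option<u64> {
    let guard = AdmittedCellGuard::new(Rc::clone(runtime), cell_id);
    if cancel_before_observe {
        return None; // the guard drops here and terminates the orphaned cell
    }
    Some(guard.disarm())
}

/// Cell 1 is cancelled mid-execute, cell 2 completes the handoff.
fn demo_guard() -> Vec<u64> {
    let runtime = Rc::new(FakeRuntime::default());
    let _ = execute(&runtime, 1, true);
    let _ = execute(&runtime, 2, false);
    let terminated = runtime.terminated.borrow().clone();
    terminated
}
```

Only the cancelled cell ends up in the terminated list; the cell whose id reached the caller keeps running.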


runtime_request(request),
runtime::ObserveMode::PendingFrontier,
)
.create_cell(runtime_request(request))

Collaborator

create_cell() uses the continuing policy now, so a fast tool response can advance past the first pending frontier before this separate observe() is registered.

Can we create this cell with PauseAtPendingFrontier and cover that interleaving?
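The interleaving in question can be modeled with a small deterministic toy. The policy names come from the PR description; `Cell`, `reach_pending_frontier`, and `observer_sees_first_frontier` are hypothetical, and the toy deliberately ignores any buffering the real runtime performs.

```rust
/// Mirror of the two policies named in the PR description (shape is assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum ExecutionPolicy {
    #[default]
    ContinueWhenUnblocked,
    PauseAtPendingFrontier,
}

/// Toy cell: counts pending frontiers it has already run past.
struct Cell {
    policy: ExecutionPolicy,
    frontiers_passed: u32,
    waiting_at_frontier: bool,
}

impl Cell {
    fn new(policy: ExecutionPolicy) -> Self {
        Self { policy, frontiers_passed: 0, waiting_at_frontier: false }
    }

    /// The runtime reaches a pending frontier; `tool_ready` models a fast tool response.
    fn reach_pending_frontier(&mut self, tool_ready: bool) {
        let may_continue =
            tool_ready && self.policy == ExecutionPolicy::ContinueWhenUnblocked;
        if may_continue {
            self.frontiers_passed += 1; // advanced before anyone could observe it
        } else {
            self.waiting_at_frontier = true;
        }
    }

    /// A late `observe(PendingFrontier)`: does it still see the first frontier?
    fn observer_sees_first_frontier(&self) -> bool {
        self.waiting_at_frontier && self.frontiers_passed == 0
    }
}

/// The tool response arrives before the separate observe call is registered.
fn demo_first_frontier_visible(policy: ExecutionPolicy) -> bool {
    let mut cell = Cell::new(policy);
    cell.reach_pending_frontier(true);
    cell.observer_sees_first_frontier()
}
```

Under the continuing default the late observer misses the first frontier in this model, while the pausing policy keeps the cell parked there until it is observed.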

if matches!(mode, ObserveMode::PendingFrontier) && pending_frontier_ready {
pending_frontier_ready = false;
match send_cell_event(
has_been_observed = true;

Collaborator

This marks the cell as observed before anything is delivered. Isn't that strange?
Feel free to ignore if not.
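For comparison, one alternative ordering sets the flag only after delivery succeeds. This is a sketch, not the PR's code: `deliver_pending_frontier` is a hypothetical helper, and a standard-library channel stands in for whatever `send_cell_event` uses.

```rust
use std::sync::mpsc::{self, Sender};

/// Hypothetical delivery step: flips `has_been_observed` only after the send succeeds.
fn deliver_pending_frontier(tx: &Sender<&'static str>, has_been_observed: &mut bool) -> bool {
    match tx.send("pending-frontier") {
        Ok(()) => {
            *has_been_observed = true;
            true
        }
        // The observer went away (for example, it was cancelled); leave the flag untouched.
        Err(_) => false,
    }
}

/// Returns the flag after one delivery attempt to a live or a dropped observer.
fn demo_observed_flag(observer_alive: bool) -> bool {
    let (tx, rx) = mpsc::channel();
    if !observer_alive {
        drop(rx); // models an observer that was cancelled before delivery
    }
    let mut has_been_observed = false;
    deliver_pending_frontier(&tx, &mut has_been_observed);
    has_been_observed
}
```

Whether this ordering is preferable depends on what `has_been_observed` is meant to track, which the thread does not settle.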

  let runtime_cell_id = self
      .runtime
-     .execute(
+     .create_cell_with_execution_policy(

Collaborator

This makes normal execute cells pause after their first yield: once the initial observer is gone, the next pending frontier stops the runtime, so async work cannot advance until wait attaches.

runtime_request(request),
runtime::ObserveMode::PendingFrontier,
)
.create_cell(runtime_request(request))

Collaborator

I think this has the inverse race: create_cell uses the continuing policy, so the runtime can run past its first pending frontier before observe(PendingFrontier) is dequeued.

@cconger cconger closed this Jun 23, 2026