[codex] consume pushed exec-server process events#30273
Conversation
66695cb to
005450e
Compare
Codex Cloud Agents (CCA) couldn't complete this review. The original Codex Review is unaffected. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 005450e87c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| } | ||
|
|
||
| #[doc(hidden)] | ||
| pub fn from_events(events: Vec<ExecProcessEvent>) -> Self { |
There was a problem hiding this comment.
Avoid exposing a test-only event receiver constructor
Because ExecProcessEventReceiver is re-exported from codex-exec-server, this pub constructor becomes part of the crate’s API even though it is only used to feed canned events in tests. Please keep this behind test support or another non-public path so the production API does not grow with a hidden test helper.
AGENTS.md reference: AGENTS.md:L87-L89
Useful? React with 👍 / 👎.
| } = output_handles; | ||
| let process = started.process; | ||
| let mut wake_rx = process.subscribe_wake(); | ||
| let mut events = process.subscribe_events(); |
There was a problem hiding this comment.
Add integration coverage for pushed exec events
This switches unified exec from wake/read polling to consuming pushed process events, including lag and legacy fallback behavior, but the diff only adds mock-based unit coverage in process_tests.rs; no core/suite TestCodex integration test exercises the real agent tool path. Please add integration coverage for the new completion path so the actual session wiring and user-facing exec behavior are protected.
AGENTS.md reference: AGENTS.md:L114-L116
Useful? React with 👍 / 👎.
| let mut last_seq: u64 = 0; | ||
| loop { | ||
| match process | ||
| .read(after_seq, /*max_bytes*/ None, /*wait_ms*/ Some(0)) |
There was a problem hiding this comment.
After we get a wakeup, we call process.read which introduces an extra round trip
|
I would keep the acceptance boundary at the event stream, not at the latency win. The correctness question here is whether "complete" becomes observable only after every authority-relevant terminal fact has crossed the same ordered channel as stdout and stderr. If output, exit, close, and sandbox-denial state can arrive on different recovery paths, the caller can accidentally treat a process as successfully complete while the denial or retained output is still sitting behind a fallback read. The regression shape I would want is:
That gives the protocol a useful invariant: pushed completion is authoritative only when the ordered stream contains the full terminal state. The retained read is then a recovery path for lag, gaps, and legacy metadata, not a second source of truth that can silently change the meaning of a completed tool call after dispatch has already resumed. The performance gain is real, but the governance property is more important: the model should not continue from a remote exec result until output, exit, close, and denial state have converged into one terminal process record. Boundary: architecture and regression-test feedback only; no claim about using this project, running this branch, validating implementation behavior, implementation correctness, performance measurements, merge readiness, security review, production readiness, partnership, customer interest, official alignment, OpenAI usage, Codex usage, conformance certification, or Neura usage. |
Summary
process/readprocess/exitedprocess/readas a retained-output and compatibility fallback for receiver lag, sequence gaps, and legacy serversTestCodexremote-exec path without adding a public test-only event constructorWhy
A successful one-shot tool call currently receives its output and terminal notifications, then pays another wide-area
process/readround trip before returning. Staging traces showed that remote response wait accounted for more than 99.8% of RPC time; local serialization, queueing, and deserialization were below 0.6 ms.Measured impact
A direct staging A/B used the same build and route and changed only completion mode. Each arm ran three times with 30 one-shot
/usr/bin/truecalls per run. The table reports the median of the three per-run percentiles.process/readprocess/readRPC p50TCP_NODELAY was enabled in both A/B arms, so its effect cancels out. The successful, complete, in-order event path issued zero final
process/readcalls.Compatibility and recovery
sandboxDeniedonprocess/exitedprocess/readIntegration coverage
The
TestCodexsuite exercises four distinct remote-exec contracts:Validation
just test -p codex-core exec_command_consumes_pushed_remote_process_events: 4 passedjust test -p codex-core unified_exec::process_tests::: 4 passedjust test -p codex-exec-server: 294 passed, 2 skippedjust test -p codex-exec-server-protocol: 5 passedjust test -p codex-rmcp-client: 89 passed, 2 skipped//codex-rs/core:core-all-test: passed across 16 shardsjust fixpassed for core and exec-serverjust fmtpassedThe complete workspace suite was not rerun; focused Cargo and Bazel coverage passed for the changed behavior.