Skip to content

fix: prevent stdout/stderr interleaving in mcp_cli_bridge for large outputs#40037

Merged
pelikhan merged 5 commits into
mainfrom
copilot/cli-tools-test-fix-json-output
Jun 18, 2026
Merged

fix: prevent stdout/stderr interleaving in mcp_cli_bridge for large outputs#40037
pelikhan merged 5 commits into
mainfrom
copilot/cli-tools-test-fix-json-output

Conversation

Copilot AI commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

On Linux, process.stdout.write() on a pipe writes at most ~64 KiB synchronously (OS pipe buffer), buffers the rest in Node.js, and returns false. Any synchronous process.stderr.write() that follows (keepalive pings via core.info) completes immediately — before the buffered stdout tail drains — injecting [info] lines mid-JSON at exactly offset ~65,536. Callers capturing combined output (e.g. the AWF bash tool with 2>&1) see corrupted JSON.

Changes

mcp_cli_bridge.cjs

  • writeStdoutAndFlush(data) — new helper that registers paired drain and error listeners (each removing the other before settling), so the Promise rejects cleanly on EPIPE/broken-pipe instead of hanging indefinitely
  • formatResponse — made async; all process.stdout.write() calls replaced with await writeStdoutAndFlush()
  • main() — stops the keepalive timer before writing output, then awaits all output calls; eliminates any possibility of a ping racing the final write
function writeStdoutAndFlush(data) {
  return new Promise((resolve, reject) => {
    const flushed = process.stdout.write(data);
    if (flushed) {
      resolve();
    } else {
      const onDrain = () => { process.stdout.removeListener("error", onError); resolve(); };
      const onError = err => { process.stdout.removeListener("drain", onDrain); reject(err); };
      process.stdout.once("drain", onDrain);
      process.stdout.once("error", onError);
    }
  });
}

mcp_cli_bridge.test.cjs

  • Three existing formatResponse tests updated to async/await
  • Four new writeStdoutAndFlush tests: immediate resolve (no backpressure), drain-wait (buffer full), ordering guarantee (stdout fully drained before any core.info call), and EPIPE rejection (stdout emits error while waiting for drain)
  • onceStub.mockRestore() calls in drain tests wrapped in try/finally to guarantee cleanup even when assertions throw

Workflow lockfiles

  • Four *.lock.yml files regenerated via make recompile (adds shell(awk) / Bash(awk) to Copilot allowlists); these changes are incidental to the bridge fix

Copilot AI and others added 2 commits June 18, 2026 12:07
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…utputs

When process.stdout.write() on a pipe returns false (OS pipe buffer ~64 KiB
full), Node.js buffers the remaining data. Any subsequent synchronous
process.stderr.write() calls (via core.info keepalive pings) complete
immediately and reach the OS before the buffered stdout tail flushes,
corrupting JSON output with interleaved [info] log lines at ~65,536 bytes.

Fix:
- Add writeStdoutAndFlush() helper that awaits 'drain' event if stdout.write
  returns false (backpressure), ensuring full flush before continuing
- Stop keepalive timer BEFORE writing output so no pings fire during writes
- Make formatResponse() async and await all stdout writes
- Update main() to await formatResponse/writeStdoutAndFlush calls

Closes #40008

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix JSON output corruption in status tool fix: prevent stdout/stderr interleaving in mcp_cli_bridge for large outputs Jun 18, 2026
Copilot AI requested a review from pelikhan June 18, 2026 12:18
@pelikhan pelikhan marked this pull request as ready for review June 18, 2026 12:22
Copilot AI review requested due to automatic review settings June 18, 2026 12:22
@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #40037 does not have the 'implementation' label and has 0 new lines of code in business logic directories (≤100 threshold).

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

PR Code Quality Reviewer completed the code quality review.

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Test Quality Sentinel completed test quality analysis.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Ready to approve

The flushing/backpressure fix is correctly implemented, call sites are updated to await it, and tests cover the drain-ordering regression path (only a minor PR-description/lockfile-scope nit remains).

Note: this review does not count toward required approvals for merging.

Pull request overview

This PR hardens the mcp_cli_bridge Node.js wrapper against stdout/stderr interleaving when emitting large payloads to a piped stdout (notably on Linux), which can corrupt captured JSON when stderr log lines land mid-write.

Changes:

  • Added writeStdoutAndFlush() and converted formatResponse() to async so stdout writes await backpressure (drain) before any stderr logging continues.
  • Updated main() to stop MCP keepalive pings before writing output and to await all output writes.
  • Updated/added Vitest coverage for backpressure + ordering; plus a small install script sync-note clarification and several regenerated workflow lockfile allowlist entries.
File summaries
File Description
actions/setup/js/mcp_cli_bridge.cjs Adds writeStdoutAndFlush, makes output formatting async, and stops keepalive before final output to prevent interleaving.
actions/setup/js/mcp_cli_bridge.test.cjs Converts relevant tests to async and adds targeted tests for backpressure/drain ordering.
actions/setup-cli/install.sh Clarifies which install script is canonical and that this file is copied/generated.
.github/workflows/spec-librarian.lock.yml Regenerated lockfile content adding shell(awk) to the allowlist.
.github/workflows/go-fan.lock.yml Regenerated lockfile content adding Bash(awk) to the allowlist.
.github/workflows/daily-testify-uber-super-expert.lock.yml Regenerated lockfile content adding shell(awk) to the allowlist.
.github/workflows/daily-file-diet.lock.yml Regenerated lockfile content adding shell(awk) to the allowlist.

Copilot's findings

  • Files reviewed: 7/7 changed files
  • Comments generated: 1

Note

Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 824 to 830
# Copilot CLI tool arguments (sorted):
# --allow-tool github
# --allow-tool safeoutputs
# --allow-tool serena
# --allow-tool shell(awk)
# --allow-tool shell(cat pkg/**/*.go)
# --allow-tool shell(cat pkg/*/*.go)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lockfile changes are intentional — they are generated by make recompile as required by the repository's workflow authoring conventions after any workflow markdown change. The four *.lock.yml diffs (shell(awk) / Bash(awk) allowlist additions) are unrelated to the bridge fix but are the correct result of compiling the current workflow sources.

@github-actions github-actions Bot mentioned this pull request Jun 18, 2026

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skills-Based Review 🧠

Applied /diagnose and /tdd — requesting changes on one correctness gap and two test-hygiene issues.

📋 Key Themes & Highlights

Issues

  • Missing reject path in writeStdoutAndFlush — a broken pipe (EPIPE) while waiting for drain will cause the Promise to hang indefinitely; the process never exits cleanly
  • Two onceStub.mockRestore() calls not guarded by try/finally — a failing assertion before either call leaves process.stdout.once permanently spied upon, potentially poisoning later tests in the same run
  • No EPIPE/error test — the missing reject path would not be caught by the test suite as written

Positive Highlights

  • ✅ Excellent root-cause analysis in the PR body — the 64 KiB pipe-buffer boundary and the exact interleaving mechanism are precisely described
  • stopKeepalive?.() moved before any output write is a smart defence-in-depth measure that eliminates the race even before writeStdoutAndFlush is invoked
  • ✅ The ordering-guarantee test (formatResponse awaits stdout drain before writing to stderr) directly reproduces the production failure scenario — this is exactly the kind of regression test that /diagnose calls for
  • ✅ All three formatResponse call sites uniformly upgraded to await writeStdoutAndFlush() — no forgotten writes

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer

Comment thread actions/setup/js/mcp_cli_bridge.cjs Outdated
if (flushed) {
resolve();
} else {
process.stdout.once("drain", resolve);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/diagnose] writeStdoutAndFlush has no reject path — if stdout emits an error event while waiting for drain (e.g. a broken pipe / EPIPE), the Promise never settles and the process hangs indefinitely.

💡 Suggested fix

Add an error listener that rejects the promise, and clean up both listeners once either event fires:

function writeStdoutAndFlush(data) {
  return new Promise((resolve, reject) => {
    const flushed = process.stdout.write(data);
    if (flushed) {
      resolve();
    } else {
      const cleanup = () => {
        process.stdout.removeListener("drain", onDrain);
        process.stdout.removeListener("error", onError);
      };
      const onDrain = () => { cleanup(); resolve(); };
      const onError = (err) => { cleanup(); reject(err); };
      process.stdout.once("drain", onDrain);
      process.stdout.once("error", onError);
    }
  });
}

Broken-pipe errors are rare in CI but real when a consumer exits early. Without a reject path the process silently stalls.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit a0b7c3ewriteStdoutAndFlush now registers both drain and error listeners, each removing the other before settling the Promise. No more possibility of an indefinite hang on EPIPE.

expect(resolved).toBe(true);
expect(stdoutChunks).toContain("large payload\n");

onceStub.mockRestore();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] onceStub.mockRestore() is not guarded — if any expect(...) above it throws, the spy is never cleaned up and will corrupt the process.stdout.once behaviour for all subsequent tests.

💡 Use try/finally
const onceStub = vi.spyOn(process.stdout, "once").mockImplementation(...);
try {
  // ... test body ...
} finally {
  onceStub.mockRestore();
}

This pattern appears in two tests in this describe block (here and in formatResponse awaits stdout drain). Alternatively, register the spy in beforeEach and restore it in afterEach alongside the existing stdoutSpy.mockRestore() call.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — onceStub.mockRestore() is now inside a try/finally block in this test.

expect(stdoutIdx).toBeGreaterThanOrEqual(0);
expect(infoIdx).toBeGreaterThan(stdoutIdx);

onceStub.mockRestore();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] Same onceStub.mockRestore() cleanup issue here — if expect(infoIdx).toBeGreaterThan(stdoutIdx) fails, the spy leaks into subsequent tests.

💡 Use try/finally
try {
  // assertions...
  expect(stdoutIdx).toBeGreaterThanOrEqual(0);
  expect(infoIdx).toBeGreaterThan(stdoutIdx);
} finally {
  onceStub.mockRestore();
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — onceStub.mockRestore() is now inside a try/finally block in this test as well.

// writeStdoutAndFlush should resolve synchronously in this case.
await writeStdoutAndFlush("hello world\n");

expect(stdoutChunks[0]).toBe("hello world\n");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] The three new tests cover the happy-path and ordering guarantee well, but there is no test for the error path: what happens when stdout emits error (EPIPE / broken pipe) while the Promise is waiting for drain?

💡 Suggested test skeleton
it("rejects when stdout emits error while waiting for drain", async () => {
  stdoutSpy.mockImplementation(chunk => {
    stdoutChunks.push(String(chunk));
    return false; // backpressure
  });
  const error = new Error("EPIPE");
  const onceStub = vi.spyOn(process.stdout, "once").mockImplementation((event, cb) => {
    if (event === "error") {
      // Fire asynchronously to simulate the stream erroring
      Promise.resolve().then(() => cb(error));
    }
    return process.stdout;
  });
  try {
    await expect(writeStdoutAndFlush("data\n")).rejects.toThrow("EPIPE");
  } finally {
    onceStub.mockRestore();
  }
});

Without this test the current implementation's missing reject path (see the companion comment on mcp_cli_bridge.cjs) would go undetected by the test suite.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added — "rejects when stdout emits error while waiting for drain (EPIPE)" test verifies the rejection path and asserts the error callback is captured before firing it.

@pelikhan

Copy link
Copy Markdown
Collaborator

@copilot run pr-finisher skill

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The core fix is sound — stopping the keepalive timer before emitting stdout correctly eliminates the race between large buffered writes and async keepalive pings. The writeStdoutAndFlush helper and the async formatResponse refactor both look correct for the stated goal.

Findings summary (2 non-blocking)

Medium — missing error rejection in writeStdoutAndFlush

process.stdout.once("drain", resolve) is registered but there is no corresponding "error" listener. If stdout enters an error state before draining, the Promise hangs and await writeStdoutAndFlush(...) blocks forever. The drain listener also leaks since it is never removed on an error path. In practice Node.js exits on EPIPE for the default process.stdout pipe, so this is low-risk today — but the same helper could be reused or tested against a custom Writable, at which point the missing reject path becomes a hard hang with no timeout.

Medium — test mock cleanup not guarded by try/finally

Both writeStdoutAndFlush drain tests call onceStub.mockRestore() unconditionally at the end. If any assertion above that line throws, the spy on process.stdout.once is never restored, corrupting the isolation of subsequent tests. Should be wrapped in try/finally or handled via onTestFinished/a nested afterEach.

🔎 Code quality review by PR Code Quality Reviewer

Comment thread actions/setup/js/mcp_cli_bridge.cjs Outdated
if (flushed) {
resolve();
} else {
process.stdout.once("drain", resolve);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

writeStdoutAndFlush Promise never rejects on stdout error — if process.stdout emits error instead of drain (e.g., broken pipe in a non-default environment), this Promise hangs indefinitely and await writeStdoutAndFlush(...) blocks the process forever. The drain listener is also never removed in that case.

💡 Suggested fix

Register an error listener alongside the drain listener so the Promise rejects and cleans up both listeners:

function writeStdoutAndFlush(data) {
  return new Promise((resolve, reject) => {
    const flushed = process.stdout.write(data);
    if (flushed) {
      resolve();
    } else {
      const onDrain = () => { process.stdout.removeListener("error", onError); resolve(); };
      const onError = (err) => { process.stdout.removeListener("drain", onDrain); reject(err); };
      process.stdout.once("drain", onDrain);
      process.stdout.once("error", onError);
    }
  });
}

Node.js exits on EPIPE for the default process.stdout pipe, so the practical risk is low in production. However if stdout is replaced with a custom Writable (e.g., in tests or future wrappers), the missing reject path causes an unrecoverable hang with no timeout.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — both onDrain and onError listeners now remove each other before settling so no leaks occur.

expect(resolved).toBe(true);
expect(stdoutChunks).toContain("large payload\n");

onceStub.mockRestore();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

onceStub.mockRestore() is not in a try/finally block — if any expect(...) assertion above this line throws, the spy on process.stdout.once is never restored. Subsequent tests that call the real once (e.g., the formatResponse ordering test at line 568+) will operate on the still-mocked version and produce spurious failures or false passes. The same issue exists in the test at line ~616.

💡 Suggested fix

Wrap the test body in try/finally, or register cleanup via onTestFinished (Vitest ≥ 1.4):

const onceStub = vi.spyOn(process.stdout, "once").mockImplementation(...);
try {
  // ... all assertions ...
} finally {
  onceStub.mockRestore();
}

Alternatively, promote the spy setup to a nested beforeEach/afterEach pair so Vitest handles teardown automatically on failure.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — both drain tests wrap the entire body (including assertions) in try/finally so mockRestore() is guaranteed to run.

@github-actions

Copy link
Copy Markdown
Contributor

🧪 Test Quality Sentinel Report

Test Quality Score: 85/100 — Excellent

Analyzed 6 test(s): 6 design, 0 implementation, 0 guideline violation(s). Three existing tests were converted to async, and three new tests cover the writeStdoutAndFlush backpressure fix.

📊 Metrics & Test Classification (6 tests analyzed)
Metric Value
New/modified tests analyzed 6
✅ Design tests (behavioral contracts) 6 (100%)
⚠️ Implementation tests (low value) 0 (0%)
Tests with error/edge cases 5 (83%)
Duplicate test clusters 0
Test inflation detected YES (103/51 = 2.02, marginally over 2:1 — see note below)
🚨 Coding-guideline violations 0
Test File Classification Issues Detected
treats MCP result envelopes with isError=true as errors mcp_cli_bridge.test.cjs:175 ✅ Design
prints progress notifications to stderr and final text result to stdout for SSE responses mcp_cli_bridge.test.cjs:191 ✅ Design
prints numeric progress to stderr when progress notification has no message mcp_cli_bridge.test.cjs:205 ✅ Design
resolves immediately when stdout.write returns true (no backpressure) mcp_cli_bridge.test.cjs:525 ✅ Design
waits for drain event when stdout.write returns false (pipe buffer full) mcp_cli_bridge.test.cjs:533 ✅ Design
formatResponse awaits stdout drain before writing to stderr (no interleaving) mcp_cli_bridge.test.cjs:570 ✅ Design

Go: 0; JavaScript: 6 (*.test.cjs). Other languages detected but not scored.

i️ Score breakdown
Component Score Notes
Behavioral Coverage (40 pts) 40 6/6 tests verify observable behavior
Error/Edge Case Coverage (30 pts) 25 5/6 tests include an error path or edge case; resolves immediately covers the happy-path branch only
Low Duplication (20 pts) 20 No duplicate clusters detected
Proportional Growth (10 pts) 0 103 additions in test vs 51 in production = 2.02 — marginally triggers the >2:1 threshold; the extra lines are justified by complex async timing tests but the penalty applies per scoring rules

Total: 85/100

Verdict

Check passed. 0% implementation tests (threshold: 30%). All six tests verify behavioral contracts. The three async conversions ensure existing tests correctly await the now-async formatResponse, and the three new writeStdoutAndFlush tests cover: (1) the no-backpressure fast path, (2) the drain-wait path, and (3) a precise ordering regression test proving stdout flushes before stderr is written — directly encoding the invariant fixed by this PR.

🧪 Test quality analysis by Test Quality Sentinel ·

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Test Quality Sentinel: 85/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). All 6 tests verify behavioral contracts with strong edge-case coverage.

Copilot AI and others added 2 commits June 18, 2026 12:46
…ore with try/finally, add EPIPE test

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

@copilot run pr-finisher skill

Done. Here is the summary:

  • Reviews — All 7 unresolved github-actions[bot] threads addressed and replied to: added EPIPE error rejection + listener cleanup to writeStdoutAndFlush, wrapped both onceStub.mockRestore() calls in try/finally, added an EPIPE rejection test. The outdated lockfile thread was explained (intentional make recompile output).
  • Checks (local)make fmt, make lint, make build, make test-unit all pass. All 42 JS tests pass.
  • Checks (CI) — Stale after agent push; needs human re-trigger. Prior CI run showed unstable status.
  • Mergeable — PR is open, not draft, no conflicts.

Actions taken: Added EPIPE rejection path with listener cleanup to writeStdoutAndFlush, guarded mockRestore() calls with try/finally, added EPIPE test with explicit expect(errorCb).not.toBeNull() assertion.

Hand-off: CI must be re-triggered by a maintainer (close/reopen PR, workflow_dispatch, or a push) before merge.

@pelikhan pelikhan merged commit 2a7fd2b into main Jun 18, 2026
21 checks passed
@pelikhan pelikhan deleted the copilot/cli-tools-test-fix-json-output branch June 18, 2026 12:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants