Skip to content

[miniflare] Fix Browser Run Windows race condition#13734

Draft
petebacondarwin wants to merge 1 commit intomainfrom
pbd/miniflare/browser-rendering-windows-flake
Draft

[miniflare] Fix Browser Run Windows race condition#13734
petebacondarwin wants to merge 1 commit intomainfrom
pbd/miniflare/browser-rendering-windows-flake

Conversation

@petebacondarwin
Copy link
Copy Markdown
Contributor

@petebacondarwin petebacondarwin commented Apr 29, 2026

Fixes the flaky Tests (Windows, packages-and-tools) CI failures observed in run 25108579707/job 73576107174, where 11 of 21 tests in packages/miniflare/test/plugins/browser/index.spec.ts failed even with { retry: 3 }.

When Miniflare launches Chrome for Browser Run bindings, it returns the WebSocket endpoint as soon as Chrome prints its DevTools listening on ws://... banner. On Windows the underlying listening socket is occasionally not yet accepting connections at that point, so the first request from workerd to Chrome fails with kj/async-io-win32.c++:281: failed: ConnectEx(): #1225 The remote computer refused the network connection.. The error propagates up through /v1/acquire as the response body Error: The remote computer refused…, which the user worker then fails to JSON-parse — the surface the test sees is SyntaxError: Unexpected token 'E', "Error: The"... is not valid JSON.

This PR addresses the issue at two layers:

  1. Chrome readiness probe in launchBrowser (packages/miniflare/src/plugins/browser-rendering/index.ts). After waitForLineOutput resolves with the WS endpoint, probe Chrome's /json/version HTTP endpoint with retry/backoff (25ms → 250ms, ~5s total) until it accepts a connection. This addresses the root cause of ConnectEx (#1225).
  2. Defense-in-depth retry inside the binding worker (packages/miniflare/src/workers/browser-rendering/binding.worker.ts). A new fetchWithConnectRetry helper retries on substring-matched transient workerd/kj errors (connection refused, remote computer refused, network name is no longer available, network connection lost, disconnected) and is applied to setSessionInfoRoute, #proxyRawWebSocket, and #proxyJsonRequest. This catches WSARecv (#64) and similar mid-fetch socket failures that the readiness probe alone cannot prevent.

The fix is internal — there are no API or behaviour changes for users beyond the elimination of the spurious connection errors. On macOS the probe succeeds on the first attempt, adding ~no measurable overhead (browser test suite still completes in ~9–10s locally over 4 consecutive runs).

  • Tests

    • Tests included/updated
    • Automated tests not possible - manual testing has been completed as follows:
    • Additional testing not necessary because: covered by current tests
  • Public documentation

    • Cloudflare docs PR(s):
    • Documentation not necessary because: internal flake fix; user-visible impact is just that browser rendering bindings no longer fail with spurious connection errors during startup on Windows.

Additional testing:

Ran pnpm -F miniflare test test/plugins/browser/index.spec.ts four consecutive times locally on macOS — all 21 tests pass each time. The failure mode being fixed is Windows-specific and not reproducible on macOS/Linux without artificial fault injection.


Open in Devin Review

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Apr 29, 2026

🦋 Changeset detected

Latest commit: 56293c3

The changes in this PR will be included in the next version bump.

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@github-project-automation github-project-automation Bot moved this to Untriaged in workers-sdk Apr 29, 2026
@workers-devprod workers-devprod requested review from a team and NuroDev and removed request for a team April 29, 2026 16:19
@workers-devprod
Copy link
Copy Markdown
Contributor

workers-devprod commented Apr 29, 2026

Codeowners approval required for this PR:

  • ✅ @cloudflare/wrangler
Show detailed file reviewers

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 29, 2026

✅ All changesets look good

@ask-bonk
Copy link
Copy Markdown
Contributor

ask-bonk Bot commented Apr 29, 2026

I've now thoroughly reviewed all three changed files. Let me analyze the code for any actionable issues:

  1. waitForBrowserReady (index.ts:251-290): The retry logic with exponential backoff is correct. The deadline-based loop, per-request timeout via AbortSignal.timeout, body draining, and error propagation via cause all look good. The URL construction on line 259 correctly converts ws:// to http:// before parsing.

  2. fetchWithConnectRetry (binding.worker.ts:83-106): The retry logic is sound. The lastError variable on line 92 and the throw lastError on line 105 caught my eye — lastError could theoretically be undefined if maxAttempts is 0, but the default is 5 and the loop structure guarantees at least one iteration for any positive maxAttempts. On the last attempt (attempt === maxAttempts - 1), the error is rethrown immediately at line 99, so the throw lastError at line 105 is effectively unreachable dead code. This is not a bug, just defensive coding.

  3. isRetryableFetchError (binding.worker.ts:62-71): The use of as cast is pragmatic for extracting the message from an unknown error type. The substring matching approach with case-insensitive comparison is appropriate for matching kj/workerd error messages.

  4. RETRYABLE_FETCH_ERROR_SUBSTRINGS (binding.worker.ts:49-60): The "disconnected" substring is quite broad — but in context, this only triggers when fetch throws an error (not on HTTP error responses), and a fetch throw with "disconnected" in the message is indeed a transient connection failure. The breadth is acceptable for this defense-in-depth layer.

  5. Changeset: Correctly scoped to miniflare as patch. Description is clear.

  6. Integration points: The three call sites in BrowserSession (setSessionInfoRoute, #proxyRawWebSocket, #proxyJsonRequest) all correctly replace fetch with fetchWithConnectRetry with matching signatures. No call sites were missed — there are no other fetch calls to Chrome in this file (the loopback fetches to localhost are correctly left as-is since those go to workerd, not Chrome).

Everything checks out. The two-layer defense (probe + retry) is well-reasoned, the implementation is correct, and the change is appropriately scoped.

LGTM

github run

@petebacondarwin petebacondarwin added the browser Issues to do with browser run/rendering label Apr 29, 2026
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

Open in Devin Review

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new Bot commented Apr 29, 2026

create-cloudflare

npm i https://pkg.pr.new/create-cloudflare@13734

@cloudflare/kv-asset-handler

npm i https://pkg.pr.new/@cloudflare/kv-asset-handler@13734

miniflare

npm i https://pkg.pr.new/miniflare@13734

@cloudflare/pages-shared

npm i https://pkg.pr.new/@cloudflare/pages-shared@13734

@cloudflare/unenv-preset

npm i https://pkg.pr.new/@cloudflare/unenv-preset@13734

@cloudflare/vite-plugin

npm i https://pkg.pr.new/@cloudflare/vite-plugin@13734

@cloudflare/vitest-pool-workers

npm i https://pkg.pr.new/@cloudflare/vitest-pool-workers@13734

@cloudflare/workers-editor-shared

npm i https://pkg.pr.new/@cloudflare/workers-editor-shared@13734

@cloudflare/workers-utils

npm i https://pkg.pr.new/@cloudflare/workers-utils@13734

wrangler

npm i https://pkg.pr.new/wrangler@13734

commit: 56293c3

@petebacondarwin petebacondarwin force-pushed the pbd/miniflare/browser-rendering-windows-flake branch from 0d5e987 to 0719495 Compare April 29, 2026 16:30
@ruifigueira
Copy link
Copy Markdown
Contributor

lgtm, but we should replace Browser Rendering with Browser Run.

@petebacondarwin petebacondarwin force-pushed the pbd/miniflare/browser-rendering-windows-flake branch from 0719495 to 69cda5f Compare April 29, 2026 16:56
devin-ai-integration[bot]

This comment was marked as resolved.

@petebacondarwin petebacondarwin force-pushed the pbd/miniflare/browser-rendering-windows-flake branch from 69cda5f to 3910c49 Compare April 29, 2026 17:03
@petebacondarwin petebacondarwin changed the title [miniflare] Fix browser rendering Windows race condition [miniflare] Fix Browser Run Windows race condition Apr 29, 2026
@petebacondarwin petebacondarwin force-pushed the pbd/miniflare/browser-rendering-windows-flake branch from 3910c49 to 7a370f0 Compare April 29, 2026 17:09
Comment thread .vscode/settings.json Outdated
Comment on lines +110 to +111
"oxc.path.oxfmt": "node_modules/.bin/oxfmt",
"oxc.path.oxlint": "node_modules/.bin/oxlint",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not the default?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't get format-on-save to work in VS Code because the extension couldn't find the binaries.
And this was a suggested workaround.
Not really relevant to this.

Copy link
Copy Markdown
Contributor

@workers-devprod workers-devprod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codeowners reviews satisfied

Wait for Chrome's DevTools port to actually accept connections before
returning from launchBrowser, and retry transient connection failures
when the binding worker reaches Chrome. Fixes flaky ConnectEx (#1225)
and WSARecv (#64) errors observed when running miniflare browser
rendering tests on Windows CI.
@petebacondarwin petebacondarwin force-pushed the pbd/miniflare/browser-rendering-windows-flake branch from 7a370f0 to 56293c3 Compare April 29, 2026 18:44
@petebacondarwin petebacondarwin marked this pull request as draft April 29, 2026 20:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

browser Issues to do with browser run/rendering

Projects

Status: Approved

Development

Successfully merging this pull request may close these issues.

4 participants