Fix white-screen-on-login (cross-origin OIDC discovery), plus third-party OAuth sign-in and refresh diagnostics #11693
Merged
Error reporting (Sentry/Rollbar) was only initialized inside AppRoot's useEffect, which runs after AppRoot commits. A render-time crash during the initial mount (e.g. the Brave white-screen-on-login) happens before that effect can run, so the SDKs were never set up and the crash was never reported. That's a large part of why this class of failure has been so hard to diagnose.
Move initErrorReporting() into the boot sequence in packs/application.tsx, right after /client_configuration resolves (the earliest the DSN/token are available) and before the router is built or any React renders. The current user id isn't known yet at that point, so add ErrorReporting#setCurrentUser and have AppRoot call it once AppRootQuery resolves, rather than re-running init (which would install a duplicate set of global handlers and double-report).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
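The ordering described in this commit can be sketched roughly as follows. This is an illustration only: `ErrorReportingSketch`, `ClientConfiguration`, and `startApp` are hypothetical names, and the real `ErrorReporting` class and `/client_configuration` payload will differ.

```typescript
// Hypothetical shape; the real /client_configuration payload has other fields.
interface ClientConfiguration {
  sentryDsn?: string;
  rollbarToken?: string;
}

// Hypothetical stand-in for the app's ErrorReporting class.
class ErrorReportingSketch {
  handlerInstalls = 0;
  enabled = false;
  currentUserId: string | undefined = undefined;

  // Install global handlers at most once; repeat calls are no-ops so errors
  // are never double-reported.
  init(config: ClientConfiguration): void {
    if (this.handlerInstalls > 0) return;
    this.enabled = Boolean(config.sentryDsn || config.rollbarToken);
    this.handlerInstalls += 1; // real code would call the SDK init functions here
  }

  // Attach the user later without touching the installed handlers.
  setCurrentUser(userId: string | undefined): void {
    this.currentUserId = userId;
  }
}

// Boot order: the configuration is already loaded, reporting starts, then render.
function startApp(
  config: ClientConfiguration,
  reporting: ErrorReportingSketch,
  render: () => void,
): void {
  reporting.init(config);
  render(); // a crash in here is now visible to the reporting SDKs
}
```

The point of the split is that `init` is safe to call exactly once at boot, while `setCurrentUser` can run any number of times after the user query resolves.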
…pps)
When signing in to a third-party OAuth relying app, the post-sign-in redirect
chain crosses origin into that app's callback (e.g.
larp-cantrip.herokuapp.com/users/auth/intercode/callback). DeviseSignInPage
submitted the login via fetch() with the default redirect: 'follow', so the
browser walked that chain as subresource (CORS-mode) requests. The cross-origin
hop into the relying app's callback was CORS-blocked — and the one-time
authorization code was burned in the process — leaving the user on the sign-in
page with "An error occurred. Please try again."
Make the post-sign-in redirect a top-level browser navigation instead, which is
not subject to CORS and works regardless of the relying app's headers:
- SessionsController#create responds to JSON requests with { location: ... }
and 200 instead of a 302. Navigational (no-JS) requests still redirect as
before. safe_sign_in_location applies the same trusted_origin? guard the
redirect used, so the JSON path can't become an open redirect.
- DeviseSignInPage sends Accept: application/json, reads the location, and
navigates to it top-level via window.location.href.
The failure path is unchanged: JSONFailureApp still returns { error: ... } JSON
on bad credentials, so inline error display keeps working. Added tests covering
all four cases.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
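A minimal sketch of the client half of this change, assuming the response bodies described above (`{ location: ... }` on success, `{ error: ... }` on failure). The function names are hypothetical, and the navigation and error display are injected as callbacks so the logic can be shown without a browser.

```typescript
type SignInOutcome =
  | { kind: "navigate"; location: string }
  | { kind: "error"; message: string };

// Turn the parsed JSON body of the sign-in response into an outcome.
// `ok` mirrors Response.ok from the fetch call.
function interpretSignInResponse(ok: boolean, body: unknown): SignInOutcome {
  const fields: Record<string, unknown> =
    typeof body === "object" && body !== null ? (body as Record<string, unknown>) : {};
  const location = fields["location"];
  if (ok && typeof location === "string") {
    return { kind: "navigate", location };
  }
  const error = fields["error"];
  return {
    kind: "error",
    message: typeof error === "string" ? error : "An error occurred. Please try again.",
  };
}

// Apply the outcome. In the browser, `navigate` would assign
// window.location.href, which is a top-level navigation and not subject to CORS.
function applySignInOutcome(
  outcome: SignInOutcome,
  navigate: (url: string) => void,
  showError: (message: string) => void,
): void {
  if (outcome.kind === "navigate") {
    navigate(outcome.location);
  } else {
    showError(outcome.message);
  }
}
```

In the page itself, the request would be sent with an `Accept: application/json` header so the server takes the JSON branch instead of issuing a 302.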
We can't yet explain why convention-site sessions drop overnight: the timing
points at the daily CleanupDbService cron, but the refresh-token rotation
machinery should let a well-behaved single browser survive it. Rather than
guess, record *why* /oauth_session/refresh returns invalid_grant so the next
occurrence gives us real data.
OAuthSessionsController#refresh now reports one of three reasons via
ErrorReporting (Sentry/Rollbar, filterable by an `oauth_refresh_failure` tag):
- cookie_absent — no refresh cookie was sent
- token_not_found — cookie carried a refresh token but no access token row
matches (the signature we'd expect if the nightly cleanup
pruned a row the cookie still referenced)
- grant_rejected — row exists but Doorkeeper refused the grant (already
revoked, or refresh-token reuse); logs the row's lifecycle
timestamps so we can tell rotation/races from deletion
No token material is logged — only the reason and safe metadata
(resource_owner_id, created/revoked/expires timestamps, previous-refresh-token
presence). Excludes the controller from Metrics/ClassLength (matching the
existing convention for application_controller.rb).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
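The three-way classification above lives in a Ruby controller; the decision logic is sketched here in TypeScript purely for illustration. `AccessTokenRow`, `classifyRefreshFailure`, and the lookup callback are hypothetical names, and the sketch assumes it is only called after the grant has already come back as `invalid_grant`.

```typescript
type RefreshFailureReason = "cookie_absent" | "token_not_found" | "grant_rejected";

// Hypothetical projection of an access token row; field names are assumptions.
interface AccessTokenRow {
  resourceOwnerId: number;
  createdAt: string;
  revokedAt: string | null;
  expiresAt: string | null;
  previousRefreshToken: string | null;
}

interface RefreshFailureReport {
  reason: RefreshFailureReason;
  metadata: Record<string, string | number | boolean | null>;
}

function classifyRefreshFailure(
  refreshCookie: string | undefined,
  findRowByRefreshToken: (token: string) => AccessTokenRow | undefined,
): RefreshFailureReport {
  // No refresh cookie was sent at all.
  if (!refreshCookie) return { reason: "cookie_absent", metadata: {} };

  // Cookie carried a token, but no row matches it (e.g. pruned by a cleanup job).
  const row = findRowByRefreshToken(refreshCookie);
  if (!row) return { reason: "token_not_found", metadata: {} };

  // Row exists, so the grant itself was refused. Report lifecycle metadata
  // only; the token values never enter the report.
  return {
    reason: "grant_rejected",
    metadata: {
      resource_owner_id: row.resourceOwnerId,
      created_at: row.createdAt,
      revoked_at: row.revokedAt,
      expires_at: row.expiresAt,
      previous_refresh_token_present: Boolean(row.previousRefreshToken),
    },
  };
}
```

Reducing the previous refresh token to a boolean keeps the report useful for telling rotation from deletion without logging any token material.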
This is the actual white-screen-on-login bug. When an anonymous visitor hit a login-required page, useLoginRequired rendered nothing (`<></>`) while it kicked off initiateAuthentication in an effect. But initiating auth is async (it awaits OIDC discovery before it can build the redirect URL), so the page stayed blank for the whole round-trip. Worse, the .then had no .catch: if discovery failed or was blocked (e.g. by Brave's shields, or the `GET 0 /.well-known/openid-configuration` seen in a local repro), the redirect never happened, nothing surfaced the error, and the visitor was left staring at a permanent white screen with only a silent unhandled rejection.
useLoginRequired now returns the element to render while not signed in (a loading indicator while redirecting, or an error with a Retry button if initiating auth fails, which is also reported via ErrorReporting), or `false` once authenticated. It also guards against firing initiateAuthentication more than once per mount. The route guard and the inline login gates render that element instead of a blank fragment.
This pairs with the earlier error-reporting hoist (so the failure is now captured) and the refresh instrumentation. It doesn't address *why* discovery is flaky; that's the remaining follow-up.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
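The behaviour described in this commit can be sketched without React as a small view-selection function plus a fire-once guard. All names here (`loginGateView`, `createAuthInitiator`, the `AuthAttempt` shape) are hypothetical; the real hook returns a React element rather than a plain object.

```typescript
type AuthAttempt =
  | { status: "idle" }
  | { status: "redirecting" }
  | { status: "failed"; error: Error };

type LoginGateView =
  | false
  | { kind: "loading" }
  | { kind: "error"; message: string; retry: () => void };

// Decide what to render: false once signed in, otherwise something visible.
function loginGateView(
  signedIn: boolean,
  attempt: AuthAttempt,
  retry: () => void,
): LoginGateView {
  if (signedIn) return false;
  if (attempt.status === "failed") {
    return { kind: "error", message: attempt.error.message, retry };
  }
  return { kind: "loading" };
}

// Fire the async auth initiation at most once until reset, and always attach
// a catch handler so a failure is surfaced instead of silently rejected.
function createAuthInitiator(
  initiate: () => Promise<void>,
  onFailure: (error: Error) => void,
): { start: () => void; reset: () => void } {
  let started = false;
  return {
    start(): void {
      if (started) return;
      started = true;
      initiate().catch((err: unknown) => {
        onFailure(err instanceof Error ? err : new Error(String(err)));
      });
    },
    // Called by the Retry button before starting again.
    reset(): void {
      started = false;
    },
  };
}
```

The key property is that no state renders as a blank fragment: every signed-out state maps to either a loading or an error view.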
…in discovery
Initiating login awaited an OIDC discovery fetch to the issuer host (/.well-known/openid-configuration). From a convention page that's a cross-origin request to the root site, which gets blocked (Brave shields in production; untrusted self-signed certs in local dev, where it showed up as `GET 0 /.well-known/openid-configuration`). When it fails, initiateAuthentication can't build the redirect URL and login wedges: the root of the white screen.
The SPA only needs three things from the issuer: the issuer URL, the authorization endpoint (to build the redirect) and the end-session endpoint (for sign-out); token exchange/refresh already go through our own same-origin /oauth_session/* endpoints. So serve those in /client_configuration (already fetched same-origin at boot) and construct openid-client's Configuration directly, dropping the discovery() call entirely. No more cross-origin dependency in the login path.
The endpoints are built by joining the issuer URL with the route paths, so a convention page gets the root-site endpoints regardless of which host served the request.
Tests: openid.test.ts confirms the authorization URL and end-session endpoint come out of the metadata-built Configuration (no fetch); a client_configuration controller test confirms the endpoints are served on the issuer host.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
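To illustrate the "join the issuer URL with the route paths" step and why no fetch is needed afterwards, here is a rough sketch using only the standard `URL` API. The route paths, parameter names on `AuthorizationParams`, and both function names are assumptions for the example; the actual change builds an openid-client `Configuration` from the same three values rather than assembling the URL by hand.

```typescript
interface OidcEndpoints {
  issuer: string;
  authorizationEndpoint: string;
  endSessionEndpoint: string;
}

// Resolve the route paths against the issuer, so the endpoints always land on
// the issuer host no matter which host served the request.
function buildOidcEndpoints(
  issuer: string,
  paths: { authorize: string; endSession: string },
): OidcEndpoints {
  return {
    issuer,
    authorizationEndpoint: new URL(paths.authorize, issuer).toString(),
    endSessionEndpoint: new URL(paths.endSession, issuer).toString(),
  };
}

interface AuthorizationParams {
  clientId: string;
  redirectUri: string;
  state: string;
  codeChallenge: string;
}

// Build the authorization redirect purely from known metadata: no network
// request is involved, so nothing here can be blocked cross-origin.
function buildAuthorizationUrl(endpoints: OidcEndpoints, params: AuthorizationParams): URL {
  const url = new URL(endpoints.authorizationEndpoint);
  url.searchParams.set("response_type", "code");
  url.searchParams.set("client_id", params.clientId);
  url.searchParams.set("redirect_uri", params.redirectUri);
  url.searchParams.set("state", params.state);
  url.searchParams.set("code_challenge", params.codeChallenge);
  url.searchParams.set("code_challenge_method", "S256");
  return url;
}
```

Because the metadata arrives with the same-origin `/client_configuration` response at boot, the only cross-origin step left in the login path is the top-level redirect itself.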
This was referenced Jun 15, 2026
Purpose
Follow-up to the OAuth-callback fix (#11686). This started as "add telemetry so we can finally see the white-screen-on-login," but along the way we reproduced it and found the root cause, so it now actually fixes it — plus a couple of other login problems that surfaced while digging.
The headline: an anonymous visitor hitting a login-required page got a blank screen because the login redirect depends on an OIDC discovery fetch to the issuer host, which is cross-origin (a convention page reaching the root site). When that request is blocked (Brave's shields in production, or untrusted self-signed certs in local dev, where it showed up as `GET 0 /.well-known/openid-configuration`), `initiateAuthentication` can't build the redirect URL, and the guard rendered nothing while it waited, with no `.catch` to surface the failure. Permanent white screen, nothing reported.
The fixes, roughly in order of impact:
- … `/oauth_session/*`. So we serve those three in `/client_configuration` (same-origin, already fetched at boot) and build openid-client's `Configuration` directly instead of calling `discovery()`. No cross-origin request in the login path anymore.
- `useLoginRequired` now renders a spinner while it redirects, or an error + Retry if initiating auth fails (and reports it), instead of `<></>`.
- … `AppRoot` effect that never runs when the initial render crashes, which is why this class of failure was invisible before.
Two more login issues fixed along the way:
- … `fetch()` and followed the post-login redirect chain into the relying app's callback, which is cross-origin and got blocked, burning the one-time auth code in the process. Now the server returns the redirect location as JSON and the browser navigates top-level, which isn't subject to CORS and doesn't depend on the relying app's headers.
- … `/oauth_session/refresh` to record why it returns `invalid_grant`. Diagnostic only; we'll follow up once we have a real event.
Changes
💻 Engineer-facing
- `/client_configuration` now returns `oidc_authorization_endpoint` + `oidc_end_session_endpoint` (built on the issuer host); the SPA constructs the openid-client `Configuration` from them, with no `discovery()` call.
- `useLoginRequired` returns the element to render while signed out (spinner / error+retry) instead of a bare boolean; the route guard and inline gates render it. It reports failures and fires the redirect only once per mount.
- `SessionsController#create` responds to JSON requests with `{ location }` (200) instead of a 302; `DeviseSignInPage` navigates top-level. `safe_sign_in_location` preserves the open-redirect guard. Inline sign-in errors (`JSONFailureApp`) are unchanged.
- `ErrorReporting#setCurrentUser` attaches the user id once `AppRootQuery` resolves.
- `OAuthSessionsController#refresh` reports `cookie_absent` / `token_not_found` / `grant_rejected` via `ErrorReporting` (filterable tag `oauth_refresh_failure`); no token material is logged.
Risks
- … `create` change alters the JSON response shape (was a 302, now `{ location }`); the no-JS navigational path still redirects.
- … `Configuration` from server-provided metadata instead of discovery means we rely on those three endpoints being correct in `/client_configuration`. There's a controller test, but worth a sanity check that login still redirects correctly in each environment.
Testing
`tsc --noEmit`, eslint, and rubocop all pass. New tests: `sessions` + `oauth_sessions` + `client_configuration` controller tests, and vitest coverage for `useLoginRequired` (signed-in / redirecting / failure paths) and `openid` (Configuration built from metadata, no fetch). The white screen was also reproduced locally and confirmed to be the discovery `status 0`.
Release plan and notes
🚢 Note: the refresh instrumentation is diagnostic only; expect a small follow-up PR once an overnight `oauth_refresh_failure` event comes in to tell us why the refresh fails.
🤖 Generated with Claude Code