Fail fast on invalid config regexes and enabled config#461
Fail fast on invalid config regexes and enabled config#461
Conversation
ChristianPavilonis
left a comment
There was a problem hiding this comment.
PR Review: "Fail fast on invalid config regexes and enabled config"
Verdict: Looks good — no blockers.
Findings
P2 — Medium (Informational)
-
RscUrlRewriternow uses a generic regex (crates/common/src/integrations/nextjs/shared.rs:22-27) — The static regex matches all URLs instead of per-origin URLs, so every URL in an RSC payload goes through the closure. The existing early-exit check (input.contains(&self.origin_host)at line 139) short-circuits the common case, so performance impact is negligible in practice. No action needed. -
Prebid
server_urlvalidation behavioral change (crates/common/src/integrations/prebid.rs:210-219) — Emptyserver_urlnow causes a hard error instead of silently disabling. This is the stated intent of the PR and is covered by the newempty_prebid_server_url_fails_orchestrator_startuptest. Acceptable.
Highlights
is_explicitly_disabledshort-circuit — Inspects raw JSON before deserialization soenabled: falseconfigs with otherwise invalid fields don't cause spurious startup failures. Smart design, well-tested.- Comprehensive test coverage — Invalid handler regexes (TOML + env var), disabled+invalid config short-circuits for integrations/providers/auction providers, enabled+invalid startup failures,
__NEXT_DATA__rewrite fixtures with ports/metacharacters/whitespace/hostname boundaries. - Consistent fallible builder pattern — Every integration's
build()uniformly returnsResult<Option<T>, Report<TrustedServerError>>, making the codebase very predictable. - Lockr regex moved to
Lazy<Regex>static (crates/common/src/integrations/lockr.rs:37-42) — Avoids recompiling on every request. strip_origin_host_with_optional_portboundary helper (crates/common/src/integrations/nextjs/shared.rs:67-100) — Properly preventsorigin.example.com.evilfrom matching while allowingorigin.example.com:8443/path. Edge cases handled correctly and tested.
All CI checks pass. High-quality hardening PR with thorough, consistent changes across all 25 files.
aram356
left a comment
There was a problem hiding this comment.
Summary
Solid defensive hardening that converts config-derived regex compilation and integration registration from panic/log-and-swallow to Result-based propagation with fail-fast at startup. The is_explicitly_disabled short-circuit is a well-targeted addition. Consistent application across all 10+ integrations with thorough test coverage.
Non-blocking
🤔 thinking
- Inconsistent parameter storage across rewriters:
RscUrlRewriterstoresorigin_host,UrlRewriteris stateless — sibling types with different conventions (shared.rs:118,script_rewriter.rs:113) restrict_accept_encodingunconditionally overwrites client preferences: prevents unsupported encodings from origin but does not honor clients that only acceptidentityor a subset (publisher.rs:19)
♻️ refactor
rewrite_nextjs_values_with_rewriteris a pass-through wrapper: delegates torewriter.rewrite_embedded()with no added logic — can be inlined (script_rewriter.rs:74)validate_pathcompiles regex redundantly:from_tomlnow callsprepare_runtime()which compiles viaOnceLock, making the validation-time compile redundant (settings.rs:473)- Redundant
prepare_runtime()call inget_settings:from_tomlalready calls it internally (settings_data.rs:37)
🌱 seedling
once_cell::sync::Lazy→std::sync::LazyLock: project uses Rust 2024 edition whereLazyLockis stable — opportunity to drop theonce_celldep in a follow-upstrip_origin_host_with_optional_portdoes not handle IPv6:RSC_URL_PATTERNcaptures bracketed IPv6 but this helper cannot match them — worth a doc comment (shared.rs:67)
👍 praise
is_explicitly_disabledshort-circuit: checking raw JSON before deserialization means disabled integrations with placeholder config do not blow up startup (settings.rs:123)- Excellent hardening test matrix: disabled-invalid-skips, enabled-invalid-fails, provider configs, registry/orchestrator startup, handler regex, env var overrides, and a real-world-shaped
__NEXT_DATA__fixture - Consistent
Result-based registration: uniform pattern across all integrations with no stragglers
CI Status
- cargo fmt: PASS
- cargo test: PASS
- vitest: PASS
- CodeQL/Analyze: PASS
- format-docs: PASS
- format-typescript: PASS
There was a problem hiding this comment.
Review: Approve
Systematically converts all integration/provider registration from "log-and-skip" to "fail-fast" error propagation, eagerly validates config-derived regexes at startup, replaces once_cell with std::sync::LazyLock, and adds comprehensive regression tests. The design is sound: invalid enabled config fails startup, disabled invalid config is silently skipped via is_explicitly_disabled, and handler regexes are eagerly prepared.
No blocking issues.
Findings
P2 — RscUrlRewriter regex now matches all URL-like patterns, not just origin
shared.rs:23-28 — The old code compiled a per-instance regex with the origin host baked in (escape(origin_host) in the pattern), so the regex engine only ever fired the callback for origin URLs. The new static RSC_URL_PATTERN matches any URL-like pattern and the callback checks strip_origin_host_with_optional_port to decide whether to rewrite. Rewriting output is identical, but non-origin URLs now pay a callback invocation + .to_string() allocation before being returned unchanged. Mitigated by the !input.contains(origin_host) fast-path at line 137 and the fact that per-instance regex compilation was the larger cost. Net performance win for the common case — just noting the tradeoff.
P2 — Cell<bool> in script rewriter
script_rewriter.rs:166 — Tracking mutation inside replace_all via Cell<bool>. Could be simplified to checking matches!(result, Cow::Borrowed(_)) since replace_all returns Cow::Borrowed when the callback always returns the original text. Current approach is correct for single-threaded wasm32-wasip1 — minor style nit.
Highlights
- Excellent regression test coverage: disabled invalid config bypass, enabled invalid config failure, handler regex validation,
__NEXT_DATA__fixture rewrites, explicit port URL handling, publisher encoding fallback is_explicitly_disabledearly exit before deserialization/validation is a thoughtful design- Consistent
Ok(None)/Ok(Some(...))/Err(...)pattern across all integrations RscUrlRewritersimplification from per-instance compiled regex to zero-sized type with sharedLazyLock<Regex>eliminates per-request regex compilationrestrict_accept_encodingprevents the origin from responding with encodings the rewrite pipeline can't handle
Summary
TrustedServerErrorresponses.__NEXT_DATA__rewrites, and publisher fallback encoding behavior.Changes
.claude/agents/pr-creator.md.claude/agents/pr-reviewer.mdcrates/common/src/auction/mod.rscrates/common/src/auth.rscrates/common/src/integrations/adserver_mock.rscrates/common/src/integrations/aps.rscrates/common/src/integrations/datadome.rscrates/common/src/integrations/didomi.rscrates/common/src/integrations/google_tag_manager.rscrates/common/src/integrations/gpt.rscrates/common/src/integrations/lockr.rscrates/common/src/integrations/mod.rscrates/common/src/integrations/nextjs/html_post_process.rscrates/common/src/integrations/nextjs/mod.rs__NEXT_DATA__coverage.crates/common/src/integrations/nextjs/rsc.rscrates/common/src/integrations/nextjs/script_rewriter.rsexpect()regex compilation with fallible rewriters and add hostname, port, whitespace, and metacharacter regressions.crates/common/src/integrations/nextjs/shared.rscrates/common/src/integrations/permutive.rscrates/common/src/integrations/prebid.rsserver_url, and return fallible provider builders.crates/common/src/integrations/registry.rscrates/common/src/integrations/testlight.rscrates/common/src/publisher.rsAccept-Encodingto codecs the rewrite pipeline supports and add regression coverage.crates/common/src/settings.rsenabled = falseconfigs, and add startup hardening regressions.crates/common/src/settings_data.rscrates/fastly/src/main.rsHardening note
Invalid
handlers[].pathregexes now fail during startup preparation and still return descriptive configuration errors if request-time code encounters unprepared settings. Enabled integrations and auction providers now fail registry/orchestrator startup instead of logging and disabling themselves, while rawenabled = falseconfigs still short-circuit before validation. Regression coverage includes invalid handler TOML and env overrides, disabled-invalid config bypasses, enabled-invalid integration/provider startup failures, emptyprebid.server_url, Next.js__NEXT_DATA__and RSC hostname/port/metacharacter rewrites, and the publisher fallback encoding path.Closes
Closes #403
Test plan
cargo test --workspacecargo clippy --all-targets --all-features -- -D warningscargo fmt --all -- --checkcd crates/js/lib && npx vitest runcd crates/js/lib && npm run formatcd docs && npm run formatcargo build --bin trusted-server-fastly --release --target wasm32-wasip1fastly compute servenpx vitest runwas skipped per explicit user instruction because of the unrelated repo-wideERR_REQUIRE_ESMfailure inhtml-encoding-sniffer.Checklist
unwrap()in production code — useexpect(\"should ...\")tracingmacros (notprintln!)