Conversation
aram356
left a comment
There was a problem hiding this comment.
Summary
Solid security hardening — removing allow-scripts/allow-same-origin from the sandbox, rejecting dangerous creatives rather than silently sanitizing, and fixing String.replace $-sequence injection. However, slot clearing before validation creates a regression where rejected bids blank the slot, and the URI detection has gaps that could cause both false negatives (data:image/svg+xml) and false positives (benign URI attribute removals).
Blocking
🔧 wrench
- Slot blanking on rejection:
container.innerHTML = ''runs before sanitization — rejected creatives destroy existing slot content. In multi-bid scenarios, a later rejected bid erases an earlier successful render. (request.ts:96) - Missing
bid.admguard: Thebid.adm &&check frommainwas removed, so bids with missing/empty/malformedadmenter the render path, clear the slot, then get rejected. (request.ts:51) - Narrow
data:URI pattern: Only blocksdata:text/html, missingdata:text/xml,data:application/xhtml+xml, anddata:image/svg+xml(SVG can embed<script>). (render.ts:35) - Over-aggressive URI attribute flagging:
isDangerousRemovalflags any removed URI attribute as dangerous regardless of value, causing false rejections for benign creatives. Inconsistent withhasDangerousMarkupwhich correctly checks the value. (render.ts:108)
Non-blocking
🤔 thinking
- 3.8x bundle size increase: DOMPurify is statically imported into
tsjs-core.js(8,964 B → 34,160 B raw, 3,788 B → 12,940 B gzip). The build usesinlineDynamicImports: trueso lazyimport()won't help. Since the policy is reject-only,hasDangerousMarkup(native<template>parser) already does the full detection. Consider removing DOMPurify entirely or moving sanitization server-side. - Static-only creative contract without rollout guard: Removing
allow-scripts+allow-same-originand rejecting script-bearing markup is a major behavioral shift. Most DSP creatives use JavaScript for tracking, viewability, and click handling. Consider a strict-render feature flag (default off) with rejection metrics, rolled out by seat/publisher.
♻️ refactor
- Inconsistent sandbox policy:
<form>is inDANGEROUS_TAG_NAMES(rejected) butallow-formsis inCREATIVE_SANDBOX_TOKENS(permitted). Removeallow-formsor stop rejecting<form>. (render.ts:38) hasDangerousMarkuplacks intent documentation: The post-sanitization re-scan is a valid safety net for sanitizer bugs, but the comment doesn't explain why DOMPurify output is being re-scanned. (render.ts:119)
⛏ nitpick
srcdocinURI_ATTRIBUTE_NAMES:srcdocis HTML content, not a URI. DOMPurify already strips it. (render.ts:33)
🌱 seedling
- Missing test coverage: (1) multi-bid same slot where one bid is rejected, (2) sanitizer-unavailable path, (3)
data:image/svg+xmlwith embedded script, (4) explicit test documenting script-based creatives are intentionally rejected.
👍 praise
buildCreativeDocument$-sequence fix: Function callbacks inString.replaceprevent replacement pattern injection. Well-tested. (render.ts:337)- Structured rejection logging: Rejection logs include metadata without leaking raw creative HTML. Tests verify no raw HTML in log output. (
request.ts:100)
CI Status
- cargo fmt: PASS
- cargo test: PASS
- vitest: PASS
- format-typescript: PASS
- format-docs: PASS
- CodeQL: PASS
ChristianPavilonis
left a comment
There was a problem hiding this comment.
No new findings in this review pass.
aram356
left a comment
There was a problem hiding this comment.
Summary
Adds server-side HTML sanitization for inline ad creatives via lol_html, tightens the client-side iframe sandbox, and improves render-path logging. The sanitizer architecture (server strips dangerous markup, client validates type/emptiness, sandbox enforces no-script) is sound, but the fail-open fallback paths and unsanitized <style> content need to be addressed before merge.
Blocking
🔧 wrench
- Fail-open on oversized input:
sanitize_creative_htmlreturns raw markup when input exceedsMAX_CREATIVE_SIZE— should return empty string (creative.rs:355) - Fail-open on parse errors: Raw markup returned when lol_html fails to parse — should return empty string (
creative.rs:464) <style>element content not sanitized: Inlinestyleattributes are checked but<style>blocks pass through withexpression(),@import, etc. (creative.rs:402)
❓ question
- Is preserving
<style>elements intentional?:<link>is stripped but<style>is allowed — inconsistent treatment (creative.rs:393)
Non-blocking
🤔 thinking
- Proxy path skips sanitization:
CreativeHtmlProcessor(inproxy.rs) only runs throughrewrite_creative_html, notsanitize_creative_html. Probably intentional since proxied pages may legitimately need scripts/iframes, but worth documenting the trust boundary difference. removedCountalways 0 on client: Client-side sanitization fields are always identity/zero, could mislead operators (render.ts:71)
♻️ refactor
data-srcandsrcsetnot checked: Missing from the URI attribute check list for defense-in-depth (creative.rs:413)
🌱 seedling
- Missing sanitizer + rewriter integration test: No test runs both in sequence as
formats.rsdoes
📝 note
- Sandbox removes
allow-scripts: Deliberate defense-in-depth but will break creatives relying on inline JS for click tracking, viewability, or animation — worth validating with real-world ad creatives
⛏ nitpick
unwrap_or("")on infallible split:.split().next()can never beNone(creative.rs:323)
CI Status
All 10 checks pass: cargo fmt, cargo test, vitest, format-typescript, format-docs, CodeQL, Analyze (actions, javascript-typescript x2, rust).
ChristianPavilonis
left a comment
There was a problem hiding this comment.
Summary
Well-executed security hardening PR. The defense-in-depth architecture is sound — server-side lol_html sanitization, client-side type/emptiness validation, tightened iframe sandbox. All CI passes. Previous review feedback fully addressed. Two minor findings below.
PR description is stale: The summary table still references crates/js/lib/package.json and crates/js/lib/package-lock.json with DOMPurify changes, but these files aren't in the diff. The current architecture (server-side lol_html, no client-side DOMPurify) is better — please update the description to match.
| // discards the tag — but the handler still fires. This is intentional and | ||
| // harmless. | ||
| element!("*", |el| { | ||
| let on_attrs: Vec<String> = el |
There was a problem hiding this comment.
🔧 Consider stripping <noscript>: Consider adding <noscript> to the stripped elements list. In the no-script sandbox (allow-scripts removed), browsers render <noscript> content. While not a direct bypass — URL rewriting catches tracking pixels downstream — stripping it is low-effort defense-in-depth against parser differential attacks.
| || url.starts_with("vbscript:") | ||
| || (url.starts_with("data:") && !is_safe_data_uri(&url)) | ||
| }); | ||
| if is_dangerous { |
There was a problem hiding this comment.
🤔 Document fail-closed srcset stripping: Worth a brief comment here documenting the intentional choice to strip the entire srcset attribute (losing safe entries) rather than selectively filtering individual entries. The current fail-closed behavior is correct — a mixed safe/dangerous srcset likely indicates a malicious creative — but the reasoning isn't obvious to future readers.
ChristianPavilonis
left a comment
There was a problem hiding this comment.
Review Summary
Excellent security hardening PR with textbook fail-closed design. A few observations worth considering — none are blockers.
Highlights:
- Textbook fail-closed design on every error path — parse failures, oversized inputs, and non-UTF-8 output all return empty string rather than passing through unsanitized markup
- Three-layer defense: server-side sanitization → client-side type/emptiness validation → sandboxed iframe with minimal permissions
- The
$-pattern injection fix inbuildCreativeDocument(using() => replacementcallbacks) is subtle and well-tested - Comprehensive test coverage: 30+ sanitizer tests in Rust covering element removal, attribute stripping, URI scheme filtering, srcset parsing, and size limits; full JS coverage for the client-side validation and rendering pipeline
- Structured logging without raw HTML prevents log injection —
originalLength/sanitizedLength/removedCountfields give observability without leaking creative content
| type RenderCreativeInlineOptions = { | ||
| slotId: string; | ||
| // Accept unknown input here because bidder JSON is untrusted at runtime. | ||
| creativeHtml: unknown; |
There was a problem hiding this comment.
🔧 P1 — parseAuctionResponse coerces adm to empty string, making unknown typing unreachable via normal flow
parseAuctionResponse does adm: b.adm ?? '', which means if the server sends a bid with no adm field (or adm: null), the parsed bid will have adm: '' (a string). The guard in request.ts (if (!bid.adm)) tests for falsiness — empty string is falsy, so this particular case works.
However, creativeHtml is typed as unknown here explicitly because "bidder JSON is untrusted at runtime", yet the AuctionBid interface types adm as string. This means sanitizeCreativeHtml(creativeHtml) will never actually receive a non-string value through the normal auction flow — the typeof !== 'string' check in sanitizeCreativeHtml is unreachable for the standard code path.
The defense-in-depth is valuable, but consider either:
- Removing the
?? ''coercion in the auction parser so client-side validation can catchnull/undefined/non-stringadmvalues in real traffic, or - Adding a clarifying comment in the parser acknowledging the type coercion and that the client-side
unknownguard is a second line of defense for non-standard callers
| // srcset attribute is removed rather than filtering individual entries. | ||
| // A mixed safe/dangerous srcset is a strong indicator of a malicious | ||
| // creative, so dropping the whole attribute is the correct response. | ||
| if let Some(val) = el.get_attribute("srcset") { |
There was a problem hiding this comment.
🔧 P2 — Missing imagesrcset handling in sanitizer
The sanitizer checks srcset for dangerous URLs but does not check imagesrcset (used on <link> and <source> elements). The URL rewriter already handles imagesrcset, which means a creative could contain an element with imagesrcset="javascript:alert(1) 1x" that passes through the sanitizer unchecked.
While <link> elements are stripped by the sanitizer (so that vector is covered), imagesrcset can appear on <source> and potentially other elements. Since <source> is not in the removal list, this is a gap.
Practical risk is low given the sandbox restrictions, but for completeness and defense-in-depth, imagesrcset should get the same treatment as srcset.
Suggested fix: duplicate the srcset check block for imagesrcset:
if let Some(val) = el.get_attribute("imagesrcset") {
let is_dangerous = val.split(',').any(|entry| {
let url = entry
.trim()
.split_ascii_whitespace()
.next()
.unwrap_or("")
.to_ascii_lowercase();
url.starts_with("javascript:")
|| url.starts_with("vbscript:")
|| (url.starts_with("data:") && !is_safe_data_uri(&url))
});
if is_dangerous {
el.remove_attribute("imagesrcset");
}
}| HtmlSettings { | ||
| element_content_handlers: vec![ | ||
| // Remove executable/dangerous elements along with their inner content. | ||
| element!("script", |el| { |
There was a problem hiding this comment.
🤔 P2 — 10 identical element-removal handlers could be consolidated
Ten nearly identical element!("tag", |el| { el.remove(); Ok(()) }) handlers are defined here (lines 364–409). lol_html supports comma-separated selectors in a single element! macro call.
This could be consolidated into a single handler:
element!("script, iframe, object, embed, base, meta, form, link, style, noscript", |el| {
el.remove();
Ok(())
})This would reduce ~50 lines to ~5 and make it easier to maintain the blocklist (adding or removing a tag is a single edit instead of a copy-paste block).
| // subtypes can carry HTML/JS payloads inside CSS url() values). | ||
| if let Some(style) = el.get_attribute("style") { | ||
| let lower = style.to_ascii_lowercase(); | ||
| if lower.contains("expression(") |
There was a problem hiding this comment.
🤔 P2 — Style attribute sanitization uses simple substring matching — CSS escapes could bypass
The inline style attribute check uses simple substring matching (lower.contains("expression("), lower.contains("javascript:"), etc.). CSS allows various forms of obfuscation that could bypass these checks:
- CSS escape sequences:
\65xpression(→expression( - CSS comments:
expr/**/ession( - Null bytes between characters
Since the iframe sandbox removes allow-scripts and allow-same-origin, CSS expression attacks won't execute in practice (they only work in IE6–8 anyway). The current approach is reasonable for the threat model.
Consider either:
- Stripping all
styleattributes entirely for maximum strictness, or - Adding a
// NOTE:comment documenting that CSS-escape bypass is mitigated by the sandbox, so a future reviewer is aware of this trade-off
The current defense is adequate given the sandbox — this is just about making the design rationale explicit.
|
|
||
| export type CreativeSanitizationRejectionReason = 'empty-after-sanitize' | 'invalid-creative-html'; | ||
|
|
||
| export type AcceptedCreativeHtml = { |
There was a problem hiding this comment.
🤔 P2 — sanitizeCreativeHtml result types have misleading fields for client-side usage
AcceptedCreativeHtml and RejectedCreativeHtml include sanitizedLength and removedCount fields, but the client-side sanitizeCreativeHtml function never actually sanitizes content — it only validates type and emptiness. For accepted creatives, sanitizedLength === originalLength and removedCount === 0 always.
The comment at line 40–44 does a good job explaining this, but the fields still add noise to logging output (every render log line includes sanitizedLength and removedCount that carry no signal).
Consider either:
- Simplifying the client-side types to only include fields that carry signal (e.g.,
originalLengthandrejectionReason), or - Renaming the function/type to
validateCreativeHtml/ValidateCreativeHtmlResultto make the distinction from server-side sanitization clearer
| import NORMALIZE_CSS from './styles/normalize.css?inline'; | ||
| import IFRAME_TEMPLATE from './templates/iframe.html?raw'; | ||
|
|
||
| const CREATIVE_SANDBOX_TOKENS = [ |
There was a problem hiding this comment.
🤔 P2 — allow-forms silently removed from sandbox
The previous sandbox included allow-forms. The PR description mentions removing allow-scripts and allow-same-origin, but allow-forms was also removed without being called out.
Removing allow-forms could break ad creatives that use forms for lead generation (e.g., email signup forms within ad units). The server-side sanitizer already strips <form> elements, so this is consistent — but it's a notable behavioral change that should be documented.
Either:
- Add
allow-formsback if form submission from creatives is a valid use case for your publishers, or - Explicitly document the
allow-formsremoval in the PR description and confirm it's intentional (since the server strips<form>anyway, this is likely fine — just worth calling out)
Summary
lol_html: strip executable elements (script,iframe,object,embed,base,meta,form,link,style,noscript),on*event-handler attributes, and dangerous URI schemes (javascript:,vbscript:, unsafedata:URIs includingdata:image/svg+xml).allow-scriptsandallow-same-origin, keeping onlyallow-popups,allow-popups-to-escape-sandbox, andallow-top-navigation-by-user-activation.bid.admbefore rendering; reject and log malformed or empty creatives with structured metadata, never logging raw creative HTML.Changes
crates/common/src/creative.rssanitize_creative_html: lol_html-based server-side sanitizer that strips dangerous elements and attributes. Fail closed on oversized input and parse errors.crates/common/src/auction/formats.rssanitize_creative_htmlbeforerewrite_creative_htmlso URL rewriting only ever sees clean markup.crates/js/lib/src/core/render.tssanitizeCreativeHtmlto type/emptiness validation only (server handles sanitization). Tighten iframe sandbox — removeallow-scriptsandallow-same-origin.crates/js/lib/src/core/request.tsbid.adm. Clear slot only after sanitization succeeds so rejected creatives never blank existing content. Add structured rejection logging.crates/js/lib/test/core/render.test.tsbuildCreativeDocumenttemplate injection.crates/js/lib/test/core/request.test.tsCloses
Closes #401
Test plan
cargo test --workspacecargo clippy --all-targets --all-features -- -D warningscargo fmt --all -- --checkcd crates/js/lib && npx vitest runcd crates/js/lib && npm run formatcd docs && npm run formatcargo build --bin trusted-server-fastly --release --target wasm32-wasip1fastly compute servecd crates/js/lib && npm run buildChecklist
unwrap()in production code — useexpect("should ...")logmacros (notprintln!)