feat(moq-relay): on-demand key resolution via --auth-keys#1188
Conversation
Replace the single-key auth model with on-demand key resolution by key
ID (kid). Keys are now fetched individually from a directory or URL when
a client connects, rather than loaded at startup.
- --auth-keys accepts a directory path (reads {dir}/{kid}.jwk) or base
URL (fetches {url}/{kid}.jwk with HTTP caching)
- Removes --auth-key and --auth-refresh-interval
- verify() is now async to support URL-based key fetching
- URL mode respects Cache-Control, stale-while-revalidate, and ETag
- kid validated to prevent path traversal
- Updated dev configs, CDN deployment, and docs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAuthentication verification was reworked to an async 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
rs/moq-relay/Cargo.toml (1)
44-44: AlignreqwestTLS backend with workspace convention (explicitly use rustls).Line 44 leaves
reqwestas"0.13"without explicit TLS backend selection. Sincemoq-relayexplicitly depends onrustls(line 45) and the workspace already follows this pattern inmoq-mux(which declaresreqwestwithrustls-tlsfeature), making the backend explicit here improves consistency and clarity.Suggested dependency update
-reqwest = "0.13" +reqwest = { version = "0.13", default-features = false, features = ["rustls-tls"] }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rs/moq-relay/Cargo.toml` at line 44, The reqwest dependency in the moq-relay manifest is declared without an explicit TLS backend; update the reqwest entry to enable the rustls TLS backend by declaring reqwest with the "rustls-tls" feature (i.e., change the version-only declaration of reqwest to a feature-enabled declaration including "rustls-tls") so it matches the workspace convention used by moq-mux; also review the existing explicit rustls dependency (if present) and remove or reconcile it if now redundant.rs/moq-relay/src/auth.rs (2)
215-219: File mode reads from disk on every request (no in-memory caching).Unlike URL mode which caches resolved keys, file mode calls
Key::from_fileon every verification. For high-throughput scenarios, this could become a performance bottleneck, though OS-level file caching will mitigate this.If this becomes an issue, consider adding optional file-mode caching with filesystem watch (e.g.,
notifycrate) for invalidation.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rs/moq-relay/src/auth.rs` around lines 215 - 219, The resolver function resolve_file currently calls Key::from_file(&path) on every verification causing repeated disk reads; change resolve_file to consult a shared in-memory cache (e.g., a HashMap<PathBuf, Arc<Key>> behind a RwLock or dashmap) keyed by path/kid and return the cached Arc<Key> when present, and only call Key::from_file and insert into the cache on miss; also add an optional invalidation mechanism (file mtime check or use the notify crate to watch the directory) to evict/update entries when the underlying .jwk file changes so Key::from_file isn’t repeatedly invoked for every request.
160-171: Consider using a more specific error variant for invalid kid characters.The function returns
AuthError::MissingKeyIdfor both empty and invalid characters. While this works, a more descriptive error likeInvalidKeyIdwould better distinguish between "no kid in header" vs "kid contains disallowed characters" for debugging purposes.💡 Suggested improvement
#[error("missing key ID in token")] MissingKeyId, + +#[error("invalid key ID format")] +InvalidKeyId,Then in
validate_kid:fn validate_kid(kid: &str) -> Result<(), AuthError> { if kid.is_empty() { return Err(AuthError::MissingKeyId); } if kid.chars().all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_') { Ok(()) } else { - Err(AuthError::MissingKeyId) + Err(AuthError::InvalidKeyId) } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rs/moq-relay/src/auth.rs` around lines 160 - 171, The validate_kid function conflates empty kid and invalid-character cases by returning AuthError::MissingKeyId for both; add a new AuthError::InvalidKeyId variant to the AuthError enum, update validate_kid to return MissingKeyId only when kid.is_empty() and return InvalidKeyId when characters are disallowed, and then adjust any matching code or tests that expect the old error to handle the new InvalidKeyId variant where appropriate (look for usages of validate_kid and pattern matches against AuthError).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cdn/relay/moq-relay.service.tftpl`:
- Line 18: The service template sets the --auth-keys argument to point at a keys
directory (seen in moq-relay.service.tftpl as the '--auth-keys' flag) but
deployment currently provisions flat .jwk files (cdn/relay/justfile) and does
not create that directory (cdn/relay/main.tf), causing runtime kid lookups to
fail; fix by either updating the service template to reference the flat file(s)
pattern used by deployment or update the deployment (justfile/main.tf) to create
and populate the expected keys directory and move/copy the .jwk files into it so
the '--auth-keys' path matches what's actually provisioned.
---
Nitpick comments:
In `@rs/moq-relay/Cargo.toml`:
- Line 44: The reqwest dependency in the moq-relay manifest is declared without
an explicit TLS backend; update the reqwest entry to enable the rustls TLS
backend by declaring reqwest with the "rustls-tls" feature (i.e., change the
version-only declaration of reqwest to a feature-enabled declaration including
"rustls-tls") so it matches the workspace convention used by moq-mux; also
review the existing explicit rustls dependency (if present) and remove or
reconcile it if now redundant.
In `@rs/moq-relay/src/auth.rs`:
- Around line 215-219: The resolver function resolve_file currently calls
Key::from_file(&path) on every verification causing repeated disk reads; change
resolve_file to consult a shared in-memory cache (e.g., a HashMap<PathBuf,
Arc<Key>> behind a RwLock or dashmap) keyed by path/kid and return the cached
Arc<Key> when present, and only call Key::from_file and insert into the cache on
miss; also add an optional invalidation mechanism (file mtime check or use the
notify crate to watch the directory) to evict/update entries when the underlying
.jwk file changes so Key::from_file isn’t repeatedly invoked for every request.
- Around line 160-171: The validate_kid function conflates empty kid and
invalid-character cases by returning AuthError::MissingKeyId for both; add a new
AuthError::InvalidKeyId variant to the AuthError enum, update validate_kid to
return MissingKeyId only when kid.is_empty() and return InvalidKeyId when
characters are disallowed, and then adjust any matching code or tests that
expect the old error to handle the new InvalidKeyId variant where appropriate
(look for usages of validate_kid and pattern matches against AuthError).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 84fd1c0e-ed2c-438f-ad50-5cde2b155571
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (14)
cdn/relay/moq-relay.service.tftpldev/relay/.gitignoredev/relay/justfiledev/relay/leaf0.tomldev/relay/leaf1.tomldev/relay/prod.tomldev/relay/root.tomldoc/app/relay/auth.mdrs/moq-relay/Cargo.tomlrs/moq-relay/src/auth.rsrs/moq-relay/src/connection.rsrs/moq-relay/src/web.rsrs/moq-relay/src/websocket.rsrs/moq-token/src/set.rs
💤 Files with no reviewable changes (1)
- rs/moq-token/src/set.rs
…rectory/URL) - Add --auth-key for single JWK file (no kid required, reads from disk per request) - Keep --auth-keys for directory/URL-based kid resolution - KeySource now has File, Dir, Url, UrlDir variants - Revert dev configs to use single key (dev.jwk) for simplicity - Fix CDN deployment to provision keys/ directory - Generate random key ID when --id not provided - Update auth docs to cover both modes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cdn/relay/justfile`:
- Around line 50-51: The rsync and remote chmod steps (the rsync line and the
ssh "chmod -R 755 ... && chmod 600 ...") fail when ../secrets/*.jwt expands to
no files; update the recipe to either enable shell nullglob before the rsync (so
../secrets/*.jwt expands to an empty list) or make the steps conditional: check
locally for any ../secrets/*.jwt matches and only run the rsync and the remote
chmod for *.jwt when matches exist (while still always copying/chmod-ing .jwk
keys as before). Ensure you adjust both the rsync invocation and the remote ssh
chmod sequence so the jwt-specific chmod is skipped if no JWTs were copied.
- Line 49: Update the rsync invocation that syncs key files so it removes remote
files that no longer exist locally—modify the command "rsync -az
../secrets/keys/ root@$HOST:/var/lib/moq/keys/" to include the --delete option
so revoked/deleted *.jwk files in ../secrets/keys/ are also removed from
/var/lib/moq/keys/ on the remote host.
In `@rs/moq-relay/src/auth.rs`:
- Around line 233-237: KeySource::UrlDir path construction can produce a double
slash when configured base ends with '/', so normalize the base before appending
kid: in the KeySource::UrlDir branch (where base and client are available, and
kid is validated via validate_kid) trim trailing '/' from base (e.g., use
base.trim_end_matches('/')) and then build the URL with a single '/' between
base and "{kid}.jwk" before calling self.resolve_cached_url(&url, client, kid).
Ensure you still call validate_kid(kid) and pass the same kid to
resolve_cached_url.
- Around line 221-230: The code performs blocking file I/O in the async
resolve() flow by calling Key::from_file() directly in KeySource::File and
KeySource::Dir branches and also drops underlying error details; replace both
direct Key::from_file(path) calls with tokio::task::spawn_blocking(move ||
Key::from_file(path)).await (propagate spawn_blocking join errors appropriately)
so the blocking read runs off the Tokio runtime, and when converting errors map
them to AuthError::KeyNotFound while preserving context using .context("failed
to load JWK from file") (via anyhow::Context or equivalent) before converting to
AuthError; keep validate_kid(kid)? and ensure you still return Ok(Arc::new(key))
on success.
- Around line 248-276: Concurrent requests for the same cache_key currently
trigger redundant fetches and background revalidations; introduce a per-key
in-flight tracker (e.g., an async Mutex<HashMap<String,
JoinHandle/oneshot::Receiver<Result<...>>>> or a Futures-aware map) on the
resolver to collapse concurrent work: before spawning a background revalidation
or calling fetch_and_cache, check the in-flight map for an existing in-progress
future for that cache_key and await it instead of starting a new request,
otherwise insert the new future and remove it once complete; update the stale
path (is_stale_usable branch) to only spawn one background revalidation per key
and update the miss/expired path to await an existing in-flight fetch_or_insert
a new fetch via fetch_and_cache, referencing resolver.fetch_and_cache, cache,
cache_key, and the background tokio::spawn usage to locate where to add the
in-flight coordination.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ebc82753-39a0-4d8a-b0aa-87fca08cef78
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (9)
cdn/relay/justfiledev/relay/justfiledev/relay/leaf0.tomldev/relay/leaf1.tomldev/relay/root.tomldoc/app/relay/auth.mdrs/moq-relay/src/auth.rsrs/moq-token-cli/Cargo.tomlrs/moq-token-cli/src/main.rs
✅ Files skipped from review due to trivial changes (1)
- rs/moq-token-cli/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (5)
- dev/relay/leaf1.toml
- dev/relay/root.toml
- dev/relay/leaf0.toml
- dev/relay/justfile
- doc/app/relay/auth.md
| rsync -az ../secrets/ root@$HOST:/var/lib/moq/ | ||
| ssh root@$HOST "chmod -R 755 /var/lib/moq && chmod 600 /var/lib/moq/*.jwk /var/lib/moq/*.jwt" | ||
| ssh root@$HOST "install -d -m 755 /var/lib/moq/keys" | ||
| rsync -az ../secrets/keys/ root@$HOST:/var/lib/moq/keys/ |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, check if the file exists and read the relevant section
cat -n cdn/relay/justfile | head -60Repository: moq-dev/moq
Length of output: 2479
🏁 Script executed:
# Also check the context around line 49 to understand the full deploy recipe
sed -n '40,60p' cdn/relay/justfileRepository: moq-dev/moq
Length of output: 1050
🏁 Script executed:
# Look for any key resolver or key validation logic that might be related
rg -A 5 -B 5 "kid.*jwk|jwk.*resolver" --type-list | head -30Repository: moq-dev/moq
Length of output: 444
🏁 Script executed:
# Check if there are other key management scripts or comments about key rotation
fd -type f -name "*.justfile" -o -name "*deploy*" -o -name "*keys*" | head -20Repository: moq-dev/moq
Length of output: 225
🏁 Script executed:
# Search for key resolver implementation and jwk handling
rg -i "kid.*jwk|resolver.*key|\.jwk" --type rust --type ts --type js -A 3 -B 1 | head -80Repository: moq-dev/moq
Length of output: 3644
🏁 Script executed:
# Look for secrets directory structure and key management docs
find . -type f -name "*.md" -o -name "README*" | xargs grep -l "key\|secret" 2>/dev/null | head -10Repository: moq-dev/moq
Length of output: 282
🏁 Script executed:
# Check if there are any comments or docs about key rotation
rg -i "rotation|revok|retired" --type md --type txt -B 2 -A 2 | head -60Repository: moq-dev/moq
Length of output: 3892
🏁 Script executed:
# Search for moq-relay or moq-cert source to understand key loading
fd -type f \( -name "*.rs" -o -name "*.ts" \) | xargs grep -l "\.jwk\|kid" 2>/dev/null | head -15Repository: moq-dev/moq
Length of output: 225
🏁 Script executed:
# Check auth documentation
cat doc/app/relay/auth.mdRepository: moq-dev/moq
Length of output: 6065
🏁 Script executed:
# Look at the auth.rs file more carefully to see the full KeyResolver implementation
sed -n '1,150p' rs/moq-relay/src/auth.rsRepository: moq-dev/moq
Length of output: 4437
🏁 Script executed:
# Check if there's any cache busting or key refresh mechanism
rg -i "cache|refresh|reload" rs/moq-relay/src/auth.rs -B 2 -A 5Repository: moq-dev/moq
Length of output: 4179
Add --delete to the keys rsync to remove revoked keys.
Without it, if a *.jwk is deleted from ../secrets/keys/ during rotation or revocation, the old key file persists under /var/lib/moq/keys/ and continues to be accepted by the key resolver on every token validation request.
🛠️ Proposed fix
- rsync -az ../secrets/keys/ root@$HOST:/var/lib/moq/keys/
+ rsync -az --delete ../secrets/keys/ root@$HOST:/var/lib/moq/keys/📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| rsync -az ../secrets/keys/ root@$HOST:/var/lib/moq/keys/ | |
| rsync -az --delete ../secrets/keys/ root@$HOST:/var/lib/moq/keys/ |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cdn/relay/justfile` at line 49, Update the rsync invocation that syncs key
files so it removes remote files that no longer exist locally—modify the command
"rsync -az ../secrets/keys/ root@$HOST:/var/lib/moq/keys/" to include the
--delete option so revoked/deleted *.jwk files in ../secrets/keys/ are also
removed from /var/lib/moq/keys/ on the remote host.
| rsync -az ../secrets/*.jwt root@$HOST:/var/lib/moq/ | ||
| ssh root@$HOST "chmod -R 755 /var/lib/moq && chmod 600 /var/lib/moq/keys/*.jwk /var/lib/moq/*.jwt" |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, locate and examine the justfile
find . -type f -name "justfile" | head -20Repository: moq-dev/moq
Length of output: 215
🏁 Script executed:
# Search for the file in cdn/relay
ls -la cdn/relay/justfile 2>/dev/null || find . -path "*/cdn/relay/justfile" 2>/dev/nullRepository: moq-dev/moq
Length of output: 122
🏁 Script executed:
# If found, read the justfile content around lines 50-51
if [ -f "cdn/relay/justfile" ]; then
echo "=== Full file context ===" && \
cat -n cdn/relay/justfile && \
echo "" && \
echo "=== Lines around 50-51 ===" && \
sed -n '45,55p' cdn/relay/justfile
fiRepository: moq-dev/moq
Length of output: 4758
🏁 Script executed:
# Also check for any shopt settings in the file
rg "shopt|nullglob|failglob" cdn/relay/justfileRepository: moq-dev/moq
Length of output: 37
Make the JWT copy/chmod steps tolerant of zero matches.
This recipe runs under Bash without nullglob enabled. When ../secrets/*.jwt has no matches (e.g., in a key-only deployment), the glob pattern is passed literally to rsync, causing the command to fail. Line 51's chmod command on the remote side will similarly fail if no JWT files were copied. Use shopt -s nullglob in the recipe or make the steps conditional to handle deployments containing only .jwk files.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cdn/relay/justfile` around lines 50 - 51, The rsync and remote chmod steps
(the rsync line and the ssh "chmod -R 755 ... && chmod 600 ...") fail when
../secrets/*.jwt expands to no files; update the recipe to either enable shell
nullglob before the rsync (so ../secrets/*.jwt expands to an empty list) or make
the steps conditional: check locally for any ../secrets/*.jwt matches and only
run the rsync and the remote chmod for *.jwt when matches exist (while still
always copying/chmod-ing .jwk keys as before). Ensure you adjust both the rsync
invocation and the remote ssh chmod sequence so the jwt-specific chmod is
skipped if no JWTs were copied.
| KeySource::File(path) => { | ||
| let key = Key::from_file(path).map_err(|_| AuthError::KeyNotFound)?; | ||
| Ok(Arc::new(key)) | ||
| } | ||
| KeySource::Dir(dir) => { | ||
| let kid = kid.ok_or(AuthError::MissingKeyId)?; | ||
| validate_kid(kid)?; | ||
| let path = dir.join(format!("{kid}.jwk")); | ||
| let key = Key::from_file(&path).map_err(|_| AuthError::KeyNotFound)?; | ||
| Ok(Arc::new(key)) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, find and read the auth.rs file around the specified lines
fd -n "auth.rs" rs/moq-relay/src/Repository: moq-dev/moq
Length of output: 285
🏁 Script executed:
# Read the file around lines 221-230 with context
sed -n '200,250p' rs/moq-relay/src/auth.rs | cat -nRepository: moq-dev/moq
Length of output: 1776
🏁 Script executed:
# Check the Key::from_file definition to confirm if it's synchronous
rg -A 10 "fn from_file" rs/moq-relay/src/ --type rustRepository: moq-dev/moq
Length of output: 37
🏁 Script executed:
# Examine the full resolve function signature and context
sed -n '200,240p' rs/moq-relay/src/auth.rs | cat -nRepository: moq-dev/moq
Length of output: 1480
🏁 Script executed:
# Look at the broader context of the resolve method and who calls it
rg -B 5 "fn resolve" rs/moq-relay/src/auth.rs | head -40Repository: moq-dev/moq
Length of output: 299
🏁 Script executed:
# Find Key::from_file definition to confirm it's synchronous
rg "fn from_file" rs/moq-relay/src/ --type rust -A 5Repository: moq-dev/moq
Length of output: 37
🏁 Script executed:
# Also search in the moq_token module if it exists
fd -type f "*.rs" | xargs rg "fn from_file" -A 5Repository: moq-dev/moq
Length of output: 1055
🏁 Script executed:
# Check who calls resolve() to understand the context
rg "\.resolve\(" rs/moq-relay/src/auth.rs -B 2 -A 2Repository: moq-dev/moq
Length of output: 229
🏁 Script executed:
# Look at the verify() function to see how resolve() is called
rg -A 20 "fn verify" rs/moq-relay/src/auth.rs | head -50Repository: moq-dev/moq
Length of output: 888
🏁 Script executed:
# Check imports to understand the error types and Key source
head -50 rs/moq-relay/src/auth.rsRepository: moq-dev/moq
Length of output: 1378
Use tokio::task::spawn_blocking() for synchronous file I/O and preserve error context with .context().
resolve() is an async function called on the connection auth path, but lines 23 and 30 perform synchronous disk I/O via Key::from_file() (which uses std::fs::read_to_string()). This blocks the Tokio task without yielding, starving other concurrent work during high reconnect churn.
Additionally, both calls use .map_err(|_| AuthError::KeyNotFound)?, discarding the underlying error information. Per coding guidelines, use .context("msg") to preserve error context for debugging key load failures.
Wrap the from_file() calls with tokio::task::spawn_blocking() or refactor to async I/O; preserve error context by adding .context() before type conversion.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@rs/moq-relay/src/auth.rs` around lines 221 - 230, The code performs blocking
file I/O in the async resolve() flow by calling Key::from_file() directly in
KeySource::File and KeySource::Dir branches and also drops underlying error
details; replace both direct Key::from_file(path) calls with
tokio::task::spawn_blocking(move || Key::from_file(path)).await (propagate
spawn_blocking join errors appropriately) so the blocking read runs off the
Tokio runtime, and when converting errors map them to AuthError::KeyNotFound
while preserving context using .context("failed to load JWK from file") (via
anyhow::Context or equivalent) before converting to AuthError; keep
validate_kid(kid)? and ensure you still return Ok(Arc::new(key)) on success.
| // Check cache first (read lock) | ||
| { | ||
| let cache = self.cache.read().await; | ||
| if let Some(cached) = cache.get(cache_key) { | ||
| if cached.is_fresh() { | ||
| return Ok(cached.key.clone()); | ||
| } | ||
| if cached.is_stale_usable() { | ||
| let key = cached.key.clone(); | ||
| let etag = cached.etag.clone(); | ||
| // Spawn background revalidation | ||
| let resolver = Arc::clone(self); | ||
| let url = url.to_owned(); | ||
| let client = client.clone(); | ||
| let cache_key = cache_key.to_owned(); | ||
| drop(cache); | ||
| tokio::spawn(async move { | ||
| if let Err(e) = resolver.fetch_and_cache(&url, &client, &cache_key, etag).await { | ||
| tracing::debug!(%cache_key, %e, "background key revalidation failed"); | ||
| } | ||
| }); | ||
| return Ok(key); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // Cache miss or expired beyond SWR — fetch synchronously | ||
| self.fetch_and_cache(url, client, cache_key, None).await | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
fd -t f 'auth.rs' rs/moq-relay/src/Repository: moq-dev/moq
Length of output: 79
🏁 Script executed:
cat -n rs/moq-relay/src/auth.rs | head -300Repository: moq-dev/moq
Length of output: 10780
🏁 Script executed:
cat -n rs/moq-relay/src/auth.rs | sed -n '300,350p'Repository: moq-dev/moq
Length of output: 1852
🏁 Script executed:
cat -n rs/moq-relay/src/auth.rs | sed -n '350,400p'Repository: moq-dev/moq
Length of output: 1659
🏁 Script executed:
rg -n "in_flight|pending|fetch" rs/moq-relay/src/auth.rs -A 2 -B 2Repository: moq-dev/moq
Length of output: 1997
🏁 Script executed:
# Check if there are any tests that verify concurrent behavior
rg -n "#\[tokio::test\]" rs/moq-relay/src/ -A 20 | head -100Repository: moq-dev/moq
Length of output: 5597
🏁 Script executed:
rg -n "concurrent|in.flight|dedup|collapse" rs/moq-relay/src/auth.rsRepository: moq-dev/moq
Length of output: 37
🏁 Script executed:
# Check if there are tests for the resolve_cached_url function specifically
rg -n "resolve_cached_url|test.*fetch" rs/moq-relay/src/auth.rs -A 10Repository: moq-dev/moq
Length of output: 756
🏁 Script executed:
# Let's look for any recent changes or TODOs
rg -n "TODO|FIXME|XXX" rs/moq-relay/src/auth.rsRepository: moq-dev/moq
Length of output: 37
🏁 Script executed:
# Check the context around resolve method to understand the call patterns
cat -n rs/moq-relay/src/auth.rs | sed -n '219,240p'Repository: moq-dev/moq
Length of output: 1017
🏁 Script executed:
# Check if there's any documentation or discussion about this being a known issue
git log --oneline -20 rs/moq-relay/src/auth.rsRepository: moq-dev/moq
Length of output: 144
🏁 Script executed:
# Let's examine the full resolve_cached_url implementation once more to confirm the issue
cat -n rs/moq-relay/src/auth.rs | sed -n '242,276p'Repository: moq-dev/moq
Length of output: 1322
🏁 Script executed:
# Check how many concurrent auth requests might happen - look at where resolve is called
rg -n "\.resolve\(" rs/moq-relay/src/ -B 2 -A 2Repository: moq-dev/moq
Length of output: 374
🏁 Script executed:
# Let's check the KeyResolver struct definition to see all its fields
cat -n rs/moq-relay/src/auth.rs | sed -n '168,172p'Repository: moq-dev/moq
Length of output: 177
🏁 Script executed:
# Let's see if there are any synchronization primitives beyond the cache RwLock
rg -n "RwLock|Mutex|Arc" rs/moq-relay/src/auth.rs | head -20Repository: moq-dev/moq
Length of output: 717
🏁 Script executed:
# Let's verify the exact problem by creating a simple test scenario analysis
# Check the code flow once more - when multiple tasks call resolve_cached_url for same cache_key
cat -n rs/moq-relay/src/auth.rs | sed -n '278,294p'Repository: moq-dev/moq
Length of output: 615
Collapse concurrent fetches per cache key.
Multiple concurrent requests for the same cache key will each issue independent HTTP requests. When a cache entry is stale (lines 255–268), every concurrent request spawns a separate background revalidation task. When a cache entry is missing or expired beyond SWR (line 275), every concurrent request calls fetch_and_cache independently, resulting in redundant HTTP calls. Under reconnect bursts or high concurrency, this amplifies load on the key endpoint with redundant fetches for the same key.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@rs/moq-relay/src/auth.rs` around lines 248 - 276, Concurrent requests for the
same cache_key currently trigger redundant fetches and background revalidations;
introduce a per-key in-flight tracker (e.g., an async Mutex<HashMap<String,
JoinHandle/oneshot::Receiver<Result<...>>>> or a Futures-aware map) on the
resolver to collapse concurrent work: before spawning a background revalidation
or calling fetch_and_cache, check the in-flight map for an existing in-progress
future for that cache_key and await it instead of starting a new request,
otherwise insert the new future and remove it once complete; update the stale
path (is_stale_usable branch) to only spawn one background revalidation per key
and update the miss/expired path to await an existing in-flight fetch_or_insert
a new fetch via fetch_and_cache, referencing resolver.fetch_and_cache, cache,
cache_key, and the background tokio::spawn usage to locate where to add the
in-flight coordination.
- Add `KeyId` struct in moq-token with validation (alphanumeric, hyphen, underscore) - `KeyId::random()` generates cryptographically random 16-hex-char IDs - Refactor moq-token-cli: separate --key per subcommand, add --out and --out-dir - Rename --auth-keys to --auth-key-dir for clarity - CDN: revert to single --auth-key for now (URL keys later) - JS token: auto-generate random kid when none provided - Rewrite auth docs covering both single-key and key-dir modes - Remove manual HTTP cache code (incompatible with reqwest 0.13) - Use url::Url::parse instead of string prefix matching Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
doc/app/relay/auth.md (1)
202-208:⚠️ Potential issue | 🟡 Minor
--expires 3600creates an already-expired token.The CLI parses
--expiresas an absolute Unix timestamp, so3600means January 1, 1970 01:00:00 UTC, not “one hour from now”. The example needsnow + 3600, not a duration.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@doc/app/relay/auth.md` around lines 202 - 208, The example uses moq-token-cli sign with the --expires flag incorrectly as a duration; update the example so --expires is set to the current Unix timestamp plus 3600 seconds (i.e., now + 3600) rather than the literal 3600 so the token expires one hour from now; change the documented invocations/examples of the moq-token-cli sign command to compute now and add 3600 for the --expires value (use a shell or script expression to calculate current epoch time + 3600) and replace the existing --expires 3600 usage accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@doc/app/relay/auth.md`:
- Around line 41-45: The example for `moq-token-cli generate` writes the public
key to ./keys/my-key.jwk but omits a `kid`, causing a mismatch between the
generated random `kid` and the expected filename; update the example invocation
of `moq-token-cli generate` (the `--algorithm ES256 --out private.jwk --public
./keys/my-key.jwk` line) to include an explicit `--id my-key` so the generated
`kid` matches the public key file name, or document that the public filename
must be derived from the generated `kid` if `--id` is not provided.
In `@js/token/src/generate.ts`:
- Around line 17-22: In generate(algorithm: Algorithm, kid?: string) ensure kid
is validated: trim the input, treat empty-string or all-whitespace as "not
provided" by assigning randomKid(), and validate non-empty kids against the
allowed pattern /^[A-Za-z0-9_-]+$/; if validation fails throw a descriptive
error. Update the logic in generate (and any call sites relying on randomKid) so
you don't rely on nullish coalescing for emptiness and enforce the regex before
constructing the returned Key (so helpers that use ...(kid && { kid }) receive
only valid kids).
In `@rs/moq-relay/src/auth.rs`:
- Around line 97-102: KeyResolver currently only stores `source` and calls
`fetch_key()` for URL sources, which does unconditional GETs; fix by adding an
internal cache map to `KeyResolver` (e.g., HashMap<String, CachedEntry> keyed by
full URL or `{source}/{kid}.jwk`) that stores ETag/Last-Modified and the cached
key bytes, update `fetch_key()` (or add a new method used by
`KeyResolver::resolve`/wherever `fetch_key` is invoked) to send conditional
requests using If-None-Match / If-Modified-Since headers from the cache, on 200
update the cache entry with new ETag/Last-Modified and body, on 304 return the
cached body, and ensure the cache is wrapped appropriately (Arc<Mutex>/RwLock)
if KeyResolver is shared across threads; touch the KeyResolver struct, its
constructor, and the `fetch_key` call sites so URL branches use the cached
lookup instead of unconditional GETs.
- Around line 177-196: fetch_key currently parses the HTTP response as raw JSON
with serde_json::from_str, but directory mode uses Key::from_file to decode the
repo's base64url-on-disk format, so static {kid}.jwk payloads fail with
DecodeFailed; update fetch_key to parse the response using the same decoding
path as Key::from_file (or extract/call the shared parsing helper used by
Key::from_file) instead of serde_json::from_str, preserving the same AuthError
mapping (e.g., return AuthError::DecodeFailed on parse/decoding errors) and keep
the existing tracing warnings in the fetch_key async function.
- Around line 248-253: The URL parsed from config.key_dir (via parse_url) can be
treated as a file path so base.join("{kid}.jwk") drops the last segment; to fix,
normalize the Url returned for KeySource::UrlDir so it ends with a trailing
slash before storing it in KeySource::UrlDir (when constructing that variant in
the code path using parse_url(&key_dir)), e.g. detect missing trailing slash on
the Url and append one (preserving query/host) before creating the
KeySource::UrlDir with base and build_http_client().
In `@rs/moq-token/src/key.rs`:
- Around line 159-161: The test assertion expecting the debug string for the
struct needs updating because the field kid is now Option<KeyId> (a tuple
struct), so its Debug output is "kid: Some(KeyId(\"test-key-1\"))" instead of
"kid: Some(\"test-key-1\")"; update the test assertion that checks the Debug
representation to expect the KeyId wrapper (look for assertions referencing kid
and KeyId in the test around the failing assertion) or format the expected
string using KeyId::from("test-key-1")'s Debug to produce the correct wrapped
output.
---
Outside diff comments:
In `@doc/app/relay/auth.md`:
- Around line 202-208: The example uses moq-token-cli sign with the --expires
flag incorrectly as a duration; update the example so --expires is set to the
current Unix timestamp plus 3600 seconds (i.e., now + 3600) rather than the
literal 3600 so the token expires one hour from now; change the documented
invocations/examples of the moq-token-cli sign command to compute now and add
3600 for the --expires value (use a shell or script expression to calculate
current epoch time + 3600) and replace the existing --expires 3600 usage
accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e4d33f31-ab6d-4e90-9800-5432c93899aa
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (13)
dev/relay/justfiledev/relay/prod.tomldoc/app/relay/auth.mdjs/token/src/cli.tsjs/token/src/generate.tsrs/moq-relay/src/auth.rsrs/moq-token-cli/src/main.rsrs/moq-token/Cargo.tomlrs/moq-token/src/generate.rsrs/moq-token/src/key.rsrs/moq-token/src/key_id.rsrs/moq-token/src/lib.rsrs/moq-token/src/set.rs
✅ Files skipped from review due to trivial changes (4)
- rs/moq-token/Cargo.toml
- js/token/src/cli.ts
- rs/moq-token/src/set.rs
- dev/relay/justfile
🚧 Files skipped from review as they are similar to previous changes (1)
- dev/relay/prod.toml
| /// A directory path or base URL containing JWK files named by key ID. | ||
| /// | ||
| /// File path: reads `{dir}/{kid}.jwk` from disk. | ||
| /// URL: fetches `{url}/{kid}.jwk` with HTTP caching. | ||
| #[arg(long = "auth-key-dir", env = "MOQ_AUTH_KEY_DIR")] | ||
| pub key_dir: Option<String>, |
There was a problem hiding this comment.
URL-backed key resolution is still uncached.
KeyResolver only stores source, and both URL branches go straight to fetch_key(), which always performs an unconditional GET. There is nowhere to retain Cache-Control/ETag state, so the newly documented HTTP-cache behavior is not implemented and every authenticated connection now depends on the remote key service.
Also applies to: 145-147, 167-172, 177-198
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@rs/moq-relay/src/auth.rs` around lines 97 - 102, KeyResolver currently only
stores `source` and calls `fetch_key()` for URL sources, which does
unconditional GETs; fix by adding an internal cache map to `KeyResolver` (e.g.,
HashMap<String, CachedEntry> keyed by full URL or `{source}/{kid}.jwk`) that
stores ETag/Last-Modified and the cached key bytes, update `fetch_key()` (or add
a new method used by `KeyResolver::resolve`/wherever `fetch_key` is invoked) to
send conditional requests using If-None-Match / If-Modified-Since headers from
the cache, on 200 update the cache entry with new ETag/Last-Modified and body,
on 304 return the cached body, and ensure the cache is wrapped appropriately
(Arc<Mutex>/RwLock) if KeyResolver is shared across threads; touch the
KeyResolver struct, its constructor, and the `fetch_key` call sites so URL
branches use the cached lookup instead of unconditional GETs.
Downgrade reqwest to 0.12 (matching moq-mux/moq-native) to enable http-cache-reqwest with moka in-memory backend. This replaces manual Cache-Control/ETag handling with RFC 7234 compliant HTTP caching. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The old format `moq-token-cli --key <path> <command>` still works but prints a deprecation warning to stderr. The --key is threaded through as a fallback for sign/verify --key and generate --out. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add --public-dir to save public key as {kid}.jwk in a directory
- Remove backwards-compat --key global arg (breaking, nobody uses it)
- Note in docs that key_dir should point to public keys for asymmetric
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
♻️ Duplicate comments (2)
rs/moq-relay/src/auth.rs (2)
164-173:⚠️ Potential issue | 🟠 MajorBlocking file I/O in async
resolve()blocks the Tokio runtime.
Key::from_file()performs synchronous disk I/O (std::fs::read_to_string). Calling it directly in an async function without yielding can starve other concurrent tasks during high reconnect churn.Additionally,
.map_err(|_| AuthError::KeyNotFound)discards the underlying error, making it harder to debug key load failures in production.🛠️ Proposed fix: use spawn_blocking and preserve error context
KeySource::File(path) => { - let key = Key::from_file(path).map_err(|_| AuthError::KeyNotFound)?; - Ok(Arc::new(key)) + let path = path.clone(); + let key = tokio::task::spawn_blocking(move || Key::from_file(&path)) + .await + .map_err(|e| { + tracing::warn!(%e, "spawn_blocking failed"); + AuthError::KeyNotFound + })? + .map_err(|e| { + tracing::warn!(path = %path.display(), %e, "failed to load key"); + AuthError::KeyNotFound + })?; + Ok(Arc::new(key)) } KeySource::Dir(dir) => { let kid = kid.ok_or(AuthError::MissingKeyId)?; let kid = KeyId::decode(kid).map_err(|_| AuthError::MissingKeyId)?; let path = dir.join(format!("{kid}.jwk")); - let key = Key::from_file(&path).map_err(|_| AuthError::KeyNotFound)?; - Ok(Arc::new(key)) + let key = tokio::task::spawn_blocking(move || Key::from_file(&path)) + .await + .map_err(|e| { + tracing::warn!(%e, "spawn_blocking failed"); + AuthError::KeyNotFound + })? + .map_err(|e| { + tracing::warn!(path = %path.display(), %e, "failed to load key"); + AuthError::KeyNotFound + })?; + Ok(Arc::new(key)) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rs/moq-relay/src/auth.rs` around lines 164 - 173, The resolve() async path is performing synchronous disk I/O via Key::from_file() inside KeySource::File and KeySource::Dir which can block the Tokio runtime and is also discarding underlying errors; change both branches to run the file load inside tokio::task::spawn_blocking (await the JoinHandle) and then wrap the resulting Key in Arc::new(...) so behavior is unchanged, and preserve the original error when mapping to AuthError (e.g., include the spawned task error or the underlying io/parse error in AuthError::KeyNotFound or a new variant) instead of using map_err(|_| ...), and handle spawn_blocking JoinError by mapping it into an AuthError as well so resolve() remains async-safe and debuggable.
264-269:⚠️ Potential issue | 🟠 MajorURL base path must be normalized to directory form with trailing slash.
When
key_diris a URL likehttps://example.com/keys(without trailing slash),Url::join()at line 179 treatskeysas a filename and replaces it, producinghttps://example.com/{kid}.jwkinstead ofhttps://example.com/keys/{kid}.jwk.🛠️ Proposed fix: ensure trailing slash before storing URL
} else if let Some(key_dir) = config.key_dir { let source = if let Some(url) = parse_url(&key_dir) { + // Ensure trailing slash so Url::join() treats it as a directory + let mut base = url; + if !base.path().ends_with('/') { + base.set_path(&format!("{}/", base.path())); + } KeySource::UrlDir { - base: url, + base, client: build_http_client()?, } } else {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rs/moq-relay/src/auth.rs` around lines 264 - 269, Normalize the URL used for KeySource::UrlDir so it always ends with a trailing slash before storing it: after parse_url(&key_dir) returns a Url, ensure the path component ends with '/' (or append one via Url::set_path or by converting to string and pushing '/') so that later Url::join calls in the kid resolution code produce directory joins (e.g., base.join("{kid}.jwk") -> https://example.com/keys/{kid}.jwk). Update the code that constructs KeySource::UrlDir (where parse_url(&key_dir) is used) to use this normalized Url.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@rs/moq-relay/src/auth.rs`:
- Around line 164-173: The resolve() async path is performing synchronous disk
I/O via Key::from_file() inside KeySource::File and KeySource::Dir which can
block the Tokio runtime and is also discarding underlying errors; change both
branches to run the file load inside tokio::task::spawn_blocking (await the
JoinHandle) and then wrap the resulting Key in Arc::new(...) so behavior is
unchanged, and preserve the original error when mapping to AuthError (e.g.,
include the spawned task error or the underlying io/parse error in
AuthError::KeyNotFound or a new variant) instead of using map_err(|_| ...), and
handle spawn_blocking JoinError by mapping it into an AuthError as well so
resolve() remains async-safe and debuggable.
- Around line 264-269: Normalize the URL used for KeySource::UrlDir so it always
ends with a trailing slash before storing it: after parse_url(&key_dir) returns
a Url, ensure the path component ends with '/' (or append one via Url::set_path
or by converting to string and pushing '/') so that later Url::join calls in the
kid resolution code produce directory joins (e.g., base.join("{kid}.jwk") ->
https://example.com/keys/{kid}.jwk). Update the code that constructs
KeySource::UrlDir (where parse_url(&key_dir) is used) to use this normalized
Url.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 36a125e8-d101-4e99-96ae-4f665fda5963
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (3)
rs/moq-relay/Cargo.tomlrs/moq-relay/src/auth.rsrs/moq-token/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (2)
- rs/moq-token/Cargo.toml
- rs/moq-relay/Cargo.toml
- JS: Add KeyIdSchema (Zod) with regex validation, typed KeyId - JS: generate() validates kid via Zod, always passes KeyId to helpers - Rust: Add Key::from_file_async behind tokio feature flag - Rust: moq-relay uses from_file_async for non-blocking key reads - Rust: AuthError uses #[from] for KeyIdError instead of map_err Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add 128-char max length to KeyId to prevent abuse via long kid headers - Ensure UrlDir base URL has trailing slash so Url::join appends correctly - Fix test_key_debug_no_secret to match new KeyId Debug format Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Files use base64url-encoded JWK (matching moq-token-cli output). URL endpoints serve plain JSON. Both are correct for their context. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Key files now output as JSON (human-readable, standard JWK tooling) - from_file/load auto-detect JSON vs base64url for backwards compat - Add --base64 flag to CLI for optional base64url output - Add to_file_base64url for explicit base64url file writes - JS cli.ts and key.ts updated to match Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Conflicts: # doc/app/relay/auth.md
Merge main's guest/guest_sub/guest_pub fields into our KeyId/auth-key-dir changes. Both features are orthogonal and work together. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add std::time::Duration qualifier in moq-token load_keys function - Add #[non_exhaustive] to all public thiserror enums (AuthError, KeyIdError, moq-mux Error, cmaf Error) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document thiserror + #[from] for libraries, anyhow for binaries, and #[non_exhaustive] for public thiserror enums. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Conflicts: # js/token/src/key.ts
- Resolve merge conflict in js/token/src/key.ts (zod/mini API style) - Fix KeyIdSchema to use z.refine() instead of .regex() for zod/mini compat - Set reqwest to use rustls-tls instead of default OpenSSL in moq-relay and moq-token, fixing linker errors from OpenSSL/BoringSSL symbol clash Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The test was calling to_file (writes JSON) but asserting the output contained no JSON characters, which only holds for base64url encoding. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Conflicts: # Cargo.lock # rs/moq-token/Cargo.toml # rs/moq-token/src/generate.rs # rs/moq-token/src/key.rs # rs/moq-token/src/set.rs
moq-token switched from anyhow to thiserror on main. Adapt CLI call sites to use ? operator to convert crate::Result to anyhow::Result. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
--auth-key(single key file/URL loaded at startup) with--auth-keys(directory or base URL for on-demand key lookup bykid)kidfrom the JWT header and fetching{source}/{kid}.jwkCache-Control,stale-while-revalidate, andETag--auth-refresh-intervaland all background refresh logicTest plan
just checkpassesmoq-relay --auth-keys /tmp/keys --auth-public ""starts and accepts connections🤖 Generated with Claude Code