Release/v0.7.0#287
Open
pigri wants to merge 245 commits into
…l to limit update frequency to once every 7 days
Switch synapse to consume the capture-only dendrite at
../dendrite-0427/crates/dendrite via [patch], drop the firewall and
bpf-windows feature flags that were removed from dendrite, and stub the
Windows xdp_blocker types (real impl arrives in PR 2 from
migrate-to-synapse/bpf_windows_blockers/).
- Cargo.toml: dendrite -> local path dep with the new feature set (drop
firewall, bpf-windows; keep capture, ndis-capture; add
windows-afxdp-capture, ssh-collector). [patch] redirects the dendrite
git refs in synapse-events and occipital to the same local path so the
workspace resolves a single dendrite version.
- crates/synapse-events: remove firewall-bridge feature and the dead
PacketMatcher impl on dendrite::firewall::FingerprintMatcher (gone).
- crates/occipital: keep the firewall feature as a placeholder so the
cfg gates in tui.rs don't warn; PR 2 will rewire them to a
synapse-internal matcher.
- src/platform/windows/xdp_blocker.rs: replace the dendrite re-export
with stub types (XdpBlockMode, WindowsXdpBlocker) that always error at
runtime. Keeps thalamus_ids_windows.rs compiling; PR 2 replaces.
- src/platform/windows/mod.rs: point xdp_blocker tests at the local stub.
- src/bin/afpacket_diag.rs: convert to a thin Linux/Windows dispatcher.
Linux body moves to src/afpacket_diag_impl.rs (outside src/bin/ so
cargo's auto-bin-discovery doesn't pick it up). Fixes pre-existing
Windows compile errors caused by AF_PACKET / sockaddr_ll / setsid.
cargo check passes on Windows. Functional changes (subscribe to
dendrite::CaptureEvent, delete the polling code in fingerprint_writer
and tcp_fingerprint) are PR1.1 -- still to come.
… (CaptureBackend)
- Add src/capture/ndis_subscriber.rs: Windows-only NdisSubscriber that
merges all NDIS adapter CaptureEnvelope streams and routes to
fingerprint log + corpus_callosum
- Replace the 52-line inline capture.run(callback) block in main.rs with
a 10-line NdisSubscriber::start() / std::thread::spawn(subscriber.run())
call
- Log message updated: "NDIS fingerprint capture started (CaptureBackend)"
- Tested on Azure VM: 20 fingerprint log entries, same JA4T/JA4TS hashes
as before
…e-to-synapse
PR 2 of the synapse refactor. Copies the staged enforcement code from
dendrite-0427/migrate-to-synapse/bpf_windows_blockers/ into
src/platform/windows/firewall/ and wires it in:
- New module src/platform/windows/firewall/ with DatapathController,
WindowsXdpBlocker (FFI), WindowsEbpfBlocker (FFI), and their C shims
- WindowsDatapathConfig/Mode/Preference config types defined in
firewall/mod.rs
- build.rs: cc::Build for xdp_block_shim.c and ebpf_block_shim.c
(Windows only)
- Cargo.toml: add cc = "1" to [build-dependencies]
- thalamus_ids_windows.rs: import path updated to firewall::xdp_blocker
- Deleted stub src/platform/windows/xdp_blocker.rs (replaced by real impl)
Verified: cargo build --release succeeds; synapse agent starts cleanly
on Azure VM with NDIS capture active and no regressions.
Follow-up to PR 2 commit (3a98a3e): stage the stub file deletion that was overlooked in the explicit git add, and record the cc = "1" build-dependency in Cargo.lock.
…napse-core
Step 1 of PR 3 (per-OS workspace split):
- Add three new workspace members: synapse-core, synapse-linux,
synapse-windows
- Move src/core/cli.rs → crates/synapse-core/src/core/cli.rs
- Inline CaptchaProvider enum (was imported from WAF captcha module)
- Inline FirewallMode enum (was duplicated in firewall + noop modules)
- Gate ProxyConfig::to_app_config on cfg(all(unix, feature = "proxy"))
- src/core/mod.rs: re-export synapse_core::core::cli so existing
crate::core::cli paths compile unchanged throughout the rest of the
codebase
- captcha.rs: replace CaptchaProvider definition with re-export from
synapse-core
- firewall/mod.rs + firewall_noop.rs: replace FirewallMode definition
with re-export
- parceyaml.rs: gate the to_app_config shortcut on
cfg(all(unix, feature = "proxy"))
cargo check --workspace passes on Windows.
Move three OS-agnostic modules from the root crate into synapse-core:
- src/utils/maxmind.rs -> synapse-core/src/utils/maxmind.rs
- src/security/geoip/mod.rs -> synapse-core/src/security/geoip/mod.rs
- src/security/threat/mod.rs -> synapse-core/src/security/threat/mod.rs
Root crate stubs replaced with pub use synapse_core::* re-exports so all
existing crate::security::geoip / crate::security::threat /
crate::utils::maxmind paths continue to resolve without touching any
caller. Added corpus-callosum, maxminddb, memmap2, arc-swap, chrono,
dashmap, pingora-memory-cache, and tokio to synapse-core dependencies.
cargo check --workspace passes on Windows.
Move three more OS-agnostic modules from the root crate into synapse-core: - src/utils/http_client.rs -> synapse-core/src/utils/http_client.rs - src/platform/agent_status.rs -> synapse-core/src/platform/agent_status.rs - src/platform/authcheck.rs -> synapse-core/src/platform/authcheck.rs Root stubs replaced with pub use synapse_core::* re-exports. Added reqwest, sha2, hex deps to synapse-core. cargo check --workspace passes on Windows.
…imports
Direct path dep `../../dendrite-0427/crates/dendrite` resolved incorrectly
(landed inside synapse-0427/ subtree). Switch to the git source intercepted
by the workspace [patch] entry, matching how synapse-events/occipital do it.
Also removes the now-unused `use serde::{Deserialize, Serialize}` from
firewall/mod.rs and firewall_noop.rs — those derives moved to synapse-core.
Moves the following OS-agnostic modules into synapse-core, replacing
root files with one-line pub-use stubs so existing crate:: paths resolve
unchanged:
- src/utils/path_sanitize.rs → crates/synapse-core/src/utils/path_sanitize.rs
- src/utils/state.rs → crates/synapse-core/src/utils/state.rs
- src/utils/tls_fingerprint.rs → crates/synapse-core/src/utils/tls_fingerprint.rs
- src/utils/tls_client_hello.rs → crates/synapse-core/src/utils/tls_client_hello.rs
- src/logger/fingerprint_log.rs → crates/synapse-core/src/logger/fingerprint_log.rs
tls_fingerprint and tls_client_hello use dendrite::Ja4/JA4S types; wired
via the git+patch dep added in the previous commit.
Moves worker/manager.rs (WorkerManager, PeriodicWorker, PeriodicTask, etc.) into synapse-core — it has no platform deps, only std + tokio. Root src/worker/manager.rs becomes a one-line pub-use stub so all existing crate::worker:: paths resolve unchanged.
…core Both workers only reference types that are already in synapse-core: http_client, worker framework, security::threat, utils::path_sanitize. Root files become pub-use stubs.
Pure-std IP/CIDR parsing helpers — no platform deps.
Phase 1 of the synapse → amigdala integration plan
(amigdala/docs/synapse-integration.md). Lossy broadcast fan-out from
the FingerprintLogger; new amigdala_bridge module subscribes,
translates each FingerprintLogEntry into amigdala::JaEvent(s), and
pumps them into a caller-supplied Reactor. On a hit, the returned
ReactorAction is forwarded to a user-wired install closure (typically
firewall.block_ip_src_port).
Constraints honoured:
* fingerprints.log path stays untouched. Broadcast happens after the
file send.
* Broadcast is bounded (1024 entries). Slow consumers see Lagged(n)
and skip — never backpressure the writer.
* Optional dep gated behind amigdala-reactor feature; default builds
unchanged.
* JA4T / JA4TS deliberately not emitted via reactor — those are
enforced at the kernel/daemon layer, redundant via this path.
Components:
* src/logger/fingerprint_log.rs — FINGERPRINT_BROADCAST OnceLock,
subscribe_fingerprints() public API, log_fingerprint() fans out.
* src/amigdala_bridge.rs (new, feature-gated) — run() consumer loop,
entries_to_events() converter with 4 unit tests covering kind
dispatch, source 5-tuple carry-through, unset-fields skip, and
invalid-IP handling.
* Cargo.toml — optional amigdala = { path = "../amigdala", features
= ["reactor"] }; new amigdala-reactor feature.
Phase 3 of the synapse → amigdala integration plan. JA4L's c
component — the app-handshake half-RTT — is the meaningful signal
for distinguishing real distant clients from colocated bots. dendrite
already supports `Ja4l::with_app_handshake(...)`; this surfaces it
through synapse's `Ja4lMeasurement` wrapper.
Changes:
* `Ja4lMeasurement` gains optional D / E / F timestamps
(app_hello_time / app_response_time / app_ack_time) plus matching
setters (set_app_hello / set_app_response / set_app_ack).
* `fingerprint_client` / `fingerprint_server` now branch on
availability: 3-segment `{tcp_rtt}_{ttl}_{app_rtt}` when D / E / F
are all set, 2-segment `{tcp_rtt}_{ttl}` otherwise — preserves
back-compat with existing 2-segment consumers.
* New tests: 3-segment emission with reference values from dendrite,
partial-app-handshake falls back to 2-segment.
Capture-site wiring (where in the TLS path D / E / F get recorded)
is the next PR — this commit ships the API only so the call sites
have a stable target to plumb into.
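The 2-segment / 3-segment branch described above can be sketched as
follows. This is a minimal illustration, not the synapse code:
`Ja4lSketch` is a hypothetical stand-in for `Ja4lMeasurement`, timestamps
are assumed to be microseconds, and the app half-RTT arithmetic
`(F - E) / 2` is an assumption rather than something this commit states.

```rust
/// Hypothetical stand-in for synapse's `Ja4lMeasurement`.
/// Field names follow the commit message; units (microseconds) are assumed.
#[derive(Default)]
pub struct Ja4lSketch {
    pub tcp_rtt_us: u64,
    pub ttl: u8,
    pub app_hello_time: Option<u64>,    // D
    pub app_response_time: Option<u64>, // E
    pub app_ack_time: Option<u64>,      // F
}

impl Ja4lSketch {
    /// Emits `{tcp_rtt}_{ttl}_{app_rtt}` only when D, E and F are all set,
    /// and the legacy `{tcp_rtt}_{ttl}` form otherwise.
    pub fn fingerprint_client(&self) -> String {
        match (self.app_hello_time, self.app_response_time, self.app_ack_time) {
            (Some(_d), Some(e), Some(f)) => {
                // Assumed formula for the client-side app half-RTT.
                let app_rtt = f.saturating_sub(e) / 2;
                format!("{}_{}_{}", self.tcp_rtt_us, self.ttl, app_rtt)
            }
            // Any missing timestamp falls back to the 2-segment form.
            _ => format!("{}_{}", self.tcp_rtt_us, self.ttl),
        }
    }
}
```

Matching on the full tuple keeps the fallback in one arm, so a partially
recorded app handshake can never produce a malformed 3-segment string.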
Phase 2 of the synapse → amigdala integration plan. After the XDP
skeleton loads, pin the dendrite-provided `ja4_client_events` ringbuf and
`ja4_client_events_dropped` per-CPU counter at stable paths under
`/sys/fs/bpf/dendrite/` per interface. Out-of-process amigdala consumers
(BpfRingbufSource::open_pinned) can then subscribe alongside our existing
in-process consumer.
Lifecycle:
* Pin happens once per attached XDP interface, after the skel loads and
the in-process ja4_store consumer is registered.
* `remove_file` on the pin path before pinning so a crashed prior run's
leftover (libbpf doesn't unpin on its own, fresh pin would fail EEXIST)
doesn't block startup.
* Pin paths are per-interface (`ja4_client_events_<iface>`) so multiple
XDP attachments don't collide.
* Best-effort: pin failures log a warning but don't affect the
in-process consumer or the firewall path.
Pin paths intentionally use the `/sys/fs/bpf/dendrite/` namespace because
the BPF program comes from dendrite -- synapse loads it but doesn't own
its identity. amigdala's docs/synapse-integration.md refers to the same
path.
This is the throughput story: NFQUEUE caps at ~100k pps, while the BPF
ringbuf path scales with TLS flow-arrival rate, which is multi-Mpps on a
single core. Producers and consumers no longer need to share a process.
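The pin lifecycle above (per-interface path, stale-pin removal,
best-effort failure handling) can be sketched with std only. The function
names are hypothetical, and the `pin` closure stands in for the real
libbpf map-pin call, which is not shown in this commit message.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Per-interface pin path under `base` (the commit uses `/sys/fs/bpf/dendrite/`).
pub fn pin_path(base: &Path, iface: &str) -> PathBuf {
    base.join(format!("ja4_client_events_{iface}"))
}

/// Remove a leftover pin from a crashed run; a missing file is not an error.
pub fn clear_stale_pin(path: &Path) -> io::Result<()> {
    match fs::remove_file(path) {
        Ok(()) => Ok(()),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(()),
        Err(e) => Err(e),
    }
}

/// Best-effort pin: failures are logged and reported as `false`, never propagated.
/// `pin` is a hypothetical closure standing in for the actual map-pin call.
pub fn pin_best_effort<F>(base: &Path, iface: &str, pin: F) -> bool
where
    F: FnOnce(&Path) -> io::Result<()>,
{
    let path = pin_path(base, iface);
    let result = clear_stale_pin(&path).and_then(|()| pin(&path));
    if let Err(e) = &result {
        eprintln!("warning: could not pin {}: {e}", path.display());
    }
    result.is_ok()
}
```

Returning a `bool` instead of a `Result` mirrors the best-effort contract:
the caller can record the outcome but has nothing to propagate.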
Phase 4 of the synapse → amigdala integration plan. Adds a thin plumbing
layer for pushing JA4T / JA4TS fingerprint patterns into amigdala's
userland NDIS enforcement daemon on Windows. Where the patterns come from
(config file, gen0sec platform API, threat feed) is intentionally out of
scope -- synapse provides the wire, operators bring the policy.
* `src/amigdala_ndis_feed.rs` (new, gated `cfg(target_os = "windows",
feature = "amigdala-ndis")`) -- `NdisFeed` shared handle wrapping
`Arc<Mutex<NdisFirewall>>`. Exposes block / unblock / allow / unallow
for JA4T and JA4TS, plus `replace_ja4t_blocklist(&[String])` for
snapshot-style feeds (flush + reinstall in one operation).
* `Cargo.toml` -- new `amigdala-ndis` feature pulling
`amigdala/windows-ndis`. Moved the optional `amigdala` dep out of
`[target.'cfg(unix)'.dependencies]` (where it was unreachable on
Windows) into the top-level `[dependencies]` so both `amigdala-reactor`
(Linux + Windows) and `amigdala-ndis` (Windows-only) can resolve it.
Verified with `cargo check --no-default-features --features amigdala-ndis`
on the Windows MSVC test host -- full build clean. Linux default +
`--features amigdala-reactor` builds also clean.
rustc 1.95+ defaults to rust-lld for x86_64-unknown-linux-gnu.
lld is strict about symbol resolution and rejects the openssl-sys
vendored OpenSSL because of an inconsistency in its assembly:
rust-lld: error: undefined symbol: bn_sqrx8x_internal
>>> referenced by x86_64-mont.s:781
>>> libcrypto-lib-x86_64-mont.o
>>> did you mean: bn_sqr8x_internal
>>> defined in: libcrypto-lib-x86_64-mont5.o
The two assembly modules in vendored openssl-sys were compiled with
different BMI2 feature flags — `x86_64-mont.s` calls into the
`bn_sqrx8x_internal` MULX variant but `x86_64-mont5.s` only ships
the non-MULX `bn_sqr8x_internal`. lld reports it; bfd ld resolves
the same set of objects without complaining (older, more lenient
symbol-matching).
`cargo build` linked OK accidentally because of object ordering;
`cargo test --bin synapse` consistently failed. Pinning bfd via
.cargo/config.toml's `[target.x86_64-unknown-linux-gnu]` table
unblocks all test builds without affecting Windows / aarch64
targets.
Verified: 4 amigdala_bridge tests + 3 JA4L tests + 174 other
synapse tests run after the fix.
…code
Migrates the fingerprint enforcement code from dendrite's
migrate-to-synapse/ into src/security/firewall/fingerprint/:
- matcher.rs: userland FingerprintMatcher for
JA4/JA4S/JA4H/JA4L/JA4SSH/JA4X/JA4D/JA4D6
- rules.rs: BpfFingerprint, Action, BlockRule
- error.rs: error enum
- bpf/xdp_filter.bpf.c: kernel XDP filter for JA4T/JA4TS
Also adds decode_l3 branch in thalamus_ids_windows so the IDS pipeline
can decode raw IP-layer frames (needed for the BLOCK_APPLIED rule path
verified on Azure).
Splits synapse into three crates:
- synapse-linux: Linux entry point (BPF/XDP, nftables, iptables, daemonize)
- synapse-windows: Windows entry point (ETW, service, AF_XDP, NDIS)
- synapse (root binary): 19-line dispatcher that calls the per-OS run()
Both per-OS crates compile the same logic by including src/app.rs via
include!(), giving each its own crate context where crate:: paths resolve
correctly without a 138-site mass rename. The root build.rs is gone --
each per-OS crate has its own build script for SkeletonBuilder /
cc::Build. Cargo.toml gates the per-OS crate as a target dependency so
cargo build on Linux pulls only synapse-linux, and on Windows pulls only
synapse-windows.
Verified on Azure VM: NDIS capture emits JA4T fingerprints, IDS rule
blocks port 22 with BLOCK_APPLIED log, traffic flows again after synapse
exits.
NdisSubscriber was only forwarding ja4t and ja4ts via the CaptureBackend trait, which only carries those two variants. Bypass CaptureBackend and consume FingerprintEvent directly so JA4, JA4S, JA4L, and JA4LS are also written to fingerprints.log and stored in corpus_callosum. Also normalises flow direction: when only JA4TS is present the event src/dst is server/client; swap them so the log entry is always keyed on (client, server). Verified on Azure: single_source=0 after fix (was 4 before).
… block
Phase 1 of the synapse → amigdala bridge: make the lossy
`subscribe_fingerprints()` broadcast (already shipped) actually
drive a kernel firewall on `Reactor` hits.
* `amigdala_bridge::spawn_with_iptables` / `spawn_with_nftables`
build an amigdala firewall with `with_ja4(...)` (which spawns
the NFQUEUE daemon and exposes a `BlockedJa4Sets` /
`Reactor`), wrap it in `Arc<Mutex<...>>`, and spawn the bridge
task with an apply closure that calls `block_ip` /
`block_ip_src_port` on each `ReactorAction`. Generic over
`FirewallBackend` so both backends share one task body.
* main.rs: spawn the iptables-backed bridge alongside the
Linux `FingerprintWriter`, gated on `feature =
"amigdala-reactor"`. Init failure logs and continues — we
don't take down synapse if iptables happens to be missing.
* Cargo.toml: amigdala dep gains the `ja4` feature so
`with_ja4(...)` and `Reactor` are in scope.
End-to-end shape after this commit:
uprobe / wire capture
→ FingerprintLogEntry
→ log_fingerprint() (file write + broadcast)
→ amigdala_bridge::run() (subscriber)
→ entries_to_events() (`JaEvent` per kind)
→ reactor.consume() (returns `ReactorAction`)
→ apply closure (locks `Arc<Mutex<Firewall>>`)
→ kernel rule install (iptables chain)
Other JA4-family kinds (JA4 / JA4S / JA4L / JA4LS / JA4SSH /
JA4H / JA4X) now have a working kernel-enforcement path on
the host where synapse runs the capture: the writer broadcasts
→ bridge converts → reactor matches against the configured
blocked set → SYNs from the offending source IP get dropped at
the netfilter layer. JA4T / JA4TS stay on the inline kernel-
match path (iptables / nftables decompose into kernel match
clauses, dendrite XDP for amigdala v0.10's xdp backend) — the
bridge intentionally skips those kinds in `entries_to_events`.
Bridge unit tests (4/4) pass; a full live test belongs in
amigdala's external matrix harness (next change there).
Three changes that together make synapse drive amigdala's XDP firewall
instead of iptables, and let an external test harness install pre-blocks
without exposing a control plane:
* `amigdala_bridge::spawn_with_xdp(iface, shutdown)` -- symmetric to
`spawn_with_iptables`, but builds an
`amigdala::firewall::xdp::XdpFirewall` (hillock TC + dendrite XDP
coexisting on the same iface) and feeds the Reactor's `BlockIp` actions
into hillock's IpFilterManager. JA4 family kinds whose capture is
uprobe-driven (JA4H, JA4X) or post-decrypt (JA4, JA4S) all flow to
*kernel* TC drops instead of stopping at userland.
* `entries_to_events` now emits `Ja4ts` events. The kind was excluded by
an outdated comment claiming "kernel-inline only" -- now that amigdala
routes Ja4ts through the userland set, the bridge has to forward it.
* `preblock_from_env`: parse `AMIGDALA_PREBLOCK=kind=value,...` on
startup and call `block_fingerprint` for each entry. Used by amigdala's
`test-xdp-all-ja4.sh` to install a known-bad fingerprint before driving
traffic.
* main.rs: spawn the XDP-backed bridge alongside the Linux
FingerprintWriter; iface comes from `config.network.iface` (falls back
to enp35s0). Init failure logs and continues.
* `Cargo.toml`: amigdala dep gains `xdp` feature; `[patch]` redirect to
local `/root/dendrite` so source-level changes to the dendrite
collectors actually land in the build (the prior git rev meant nothing
I edited was reaching the BPF skel).
Verified by amigdala's `tools/test-xdp-all-ja4.sh` -- every JA4 family
kind drops real cross-machine SYNs at the kernel after this commit. Wire
trace included; per-backend run signature differs (TC vs XDP,
post-handshake vs SYN-time) and the verdict logic in the harness accepts
both shapes.
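A minimal sketch of parsing the `AMIGDALA_PREBLOCK=kind=value,...` format,
under stated assumptions: the function names are hypothetical, malformed
entries are assumed to be skipped rather than rejected, and the
`block_fingerprint` call itself is left to the caller.

```rust
/// Parse `kind=value,kind=value,...` into (kind, value) pairs.
/// Entries with no `=` or with an empty side are skipped (assumed policy).
pub fn parse_preblock(raw: &str) -> Vec<(String, String)> {
    raw.split(',')
        .filter_map(|entry| {
            // Split on the first `=` only, so values may contain `=`.
            let (kind, value) = entry.trim().split_once('=')?;
            let (kind, value) = (kind.trim(), value.trim());
            if kind.is_empty() || value.is_empty() {
                return None;
            }
            Some((kind.to_string(), value.to_string()))
        })
        .collect()
}

/// Read the variable named in the commit; unset or non-UTF-8 yields no entries.
pub fn preblock_entries_from_env() -> Vec<(String, String)> {
    std::env::var("AMIGDALA_PREBLOCK")
        .map(|raw| parse_preblock(&raw))
        .unwrap_or_default()
}
```

A caller would iterate the returned pairs and hand each one to the
firewall's block call at startup.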
…r-node engine
Heavy signature IDS no longer has to run inside every scaled proxy
replica; ruleset cost becomes O(nodes) instead of O(proxy replicas).
P0 (no-behaviour-change foundation):
- synapse-events: `HttpTxn` — the serde producer<->consumer boundary
(mirrors `SignatureEngine::inspect_http` inputs + content_type).
- synapse-proxy/thalamus_post_tls: split into pure `build_http_txn()`
+ `inspect_txn()`; inline composes them (byte-identical to legacy).
- synapse-core: `IdsConfig.post_tls { mode, socket }`, `#[serde(default)]`
so an absent key = `inline` = exact legacy behaviour.
P1 (offload transport + per-node consumer):
- synapse-eventbridge/ids_l7: framed UDS transport — a bounded,
non-blocking, drop-on-backpressure `L7Sink` (proxy) and
`start_l7_ingest_server` (agent). Frame `[u32 LE len][serde_json
HttpTxn]`; the server chmods the socket 0666 so a non-root proxy can
connect (node-local IPC; shared-gid tightening is a later refinement).
- thalamus_post_tls: `offload` builds the sink ONLY (no engine/ruleset
in the proxy); `is_active()` dispatch — inline inspects (can block),
offload fire-and-forgets the txn and returns None (never blocks the
request path on IDS).
- synapse-idp/start_l7_ingest: ONE per-node `SignatureEngine`
consuming the UDS; wired in synapse-app for agent + ids.enabled +
post_tls.mode==offload.
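The `[u32 LE len][payload]` frame layout used on the UDS can be sketched
as below. This is an illustration only: the payload is treated as opaque
bytes (the real frames carry a serde_json `HttpTxn`), and
`FRAME_MAX_SKETCH` is an assumed cap, since the actual limit is not given
here.

```rust
use std::io::{self, Read, Write};

/// Assumed upper bound for this sketch; the real limit is not stated in the source.
pub const FRAME_MAX_SKETCH: usize = 1 << 20;

/// Write one frame: 4-byte little-endian length, then the payload bytes.
pub fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > FRAME_MAX_SKETCH {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    w.write_all(&(payload.len() as u32).to_le_bytes())?;
    w.write_all(payload)
}

/// Read one frame, rejecting oversized lengths before allocating.
pub fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf)?;
    let len = u32::from_le_bytes(len_buf) as usize;
    if len > FRAME_MAX_SKETCH {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len];
    r.read_exact(&mut payload)?;
    Ok(payload)
}
```

Checking the length before allocating matters on the ingest side: the
socket is writable by any local process, so a hostile length prefix must
not be able to force a large allocation.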
Verified in k3s: proxy non-root with zero ruleset even at 2 replicas;
agent runs one node-wide engine; a sqlmap-UA request through the
ruleset-less proxy is detected out-of-band ("ET SCAN Sqlmap SQL
Injection Scan", sid 2008538); WAF suite 26/26 unaffected. Canonical
`cargo check --features proxy` + clippy (new crates) + synapse-events
tests green.
offload is detect/alert only (no inline block — by design, IDS must
never add latency/failure to serving). L7-alert telemetry/enforcer
parity and scale/throughput are P2/P3.
The decoupled L7-IDS consumer now treats detections exactly like the XDP
IDS worker, so existing OTLP/eventbridge consumers and kernel enforcement
need no L7-specific handling:
- Telemetry/eventbridge parity: emit via the same
`synapse_blocking_log::Builder` path the XDP IDS uses (BlockLayer::Ids /
RateLimit; BlockAction::Notice for observation, Block/Ratelimit when
enforced). L7 detections now appear as `synapse.block.events.ids`
(service.name=synapse, layer=ids) identically to XDP-IDS alerts.
- Enforcement parity: a block/drop/reject action under `enforce_block`
calls `synapse_security::runtime_blocklist::block_ip_runtime(src_ip)` --
the same runtime blocklist the agent's XDP firewall consults -- so
future packets from the offender drop at the NIC.
Verified in k3s (16GB/8vCPU VM): dual full-ET engines (XDP-IDS +
L7-ingest, 49604 rules each) up in ~48s, agent Running/0-restarts (no
OOM); sqlmap UA via the offload proxy and a /__l7drop request both
surfaced at the otel-collector as synapse.block.events.ids (action notice
and block); enforce path logged `action=block blocked=true`; proxy still
non-root with zero ruleset; WAF 26/26.
Wire-level packet drop verification remains the documented XDP-east-west
topology caveat (out of scope); P2 proves the code-path +
telemetry/enforce parity. P3 = scale/throughput + optional dedicated IDS
tier.
…al-unicast src IPs
P3 surfaced a remotely-triggerable DoS in the decoupled post-TLS IDS: in
`offload` the L7 engine's `src_ip` is the *proxy-observed* client, which
behind NAT / LB / CDN / kube-proxy is routinely a shared infra address.
P2's enforce-parity then `block_ip_runtime`'d it, so one `/__l7drop`
request banned the shared k3s/SLIRP ingress IP and blackholed every
client behind it (WAF 9/25). The parity code was correct; the offload
input is the untrustworthy part.
Refinement:
- New `synapse_security::runtime_blocklist::is_ban_safe_ip` -- the single
source of truth for "may this IP be pushed into a kernel firewall".
Global-unicast only; rejects loopback / RFC1918 / CGNAT (100.64.0.0/10) /
link-local / broadcast / documentation / unspecified / multicast / IPv6
ULA & link-local. Fails closed.
- New `PostTlsIdsConfig.enforce` (`#[serde(default)]` = false),
deliberately distinct from `ids.enforce_block`: `offload` now defaults to
pure detect/alert regardless of `enforce_block`. IP enforcement is an
explicit opt-in.
- `thalamus_l7_ingest`: enforcement keys off `post_tls.enforce`, and even
when enabled every ban is hard-gated by `is_ban_safe_ip`. A shared /
non-global-unicast src is logged and downgraded to detect-only (telemetry
still emits, as Notice).
- `waf_fw_offload::is_safe_to_block` now delegates to the shared helper
(one implementation; also gains the CGNAT coverage it previously lacked).
The own-listener exemption stays proxy-local.
Gates: `cargo check --features proxy` ✓; `-p synapse-app --features
proxy,amygdala-reactor` ✓; `cargo test -p synapse-security
runtime_blocklist::` 2/2 ✓.
Verified in k3s with the adversarial config (`post_tls.enforce:true` +
the `/__l7drop` drop rule -- the exact P3-blackhole setup): alert fires
`blocked=false`, guard refuses banning `10.42.0.1`, runtime blocklist
gains 0 shared IPs, otel still receives `sid:9000009` as
`synapse.action=notice`, proxy still non-root/zero-ruleset, and WAF 26/26
-- ingress is no longer taken down.
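A std-only sketch of the kind of guard `is_ban_safe_ip` describes, covering
the address classes listed above. It is not the synapse implementation:
the function names are hypothetical, and the IPv4-mapped IPv6 handling is
an extra precaution added for this sketch, not something this commit
mentions.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Returns true only for addresses that look like global unicast.
pub fn is_ban_safe_ip_sketch(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_ban_safe_v4(v4),
        // Assumption of this sketch: judge ::ffff:a.b.c.d by its embedded IPv4.
        IpAddr::V6(v6) => match v6.to_ipv4_mapped() {
            Some(v4) => is_ban_safe_v4(v4),
            None => is_ban_safe_v6(v6),
        },
    }
}

fn is_ban_safe_v4(ip: Ipv4Addr) -> bool {
    let o = ip.octets();
    let cgnat = o[0] == 100 && (o[1] & 0xc0) == 64; // 100.64.0.0/10
    !(ip.is_loopback()
        || ip.is_private() // RFC1918
        || cgnat
        || ip.is_link_local()
        || ip.is_broadcast()
        || ip.is_documentation()
        || ip.is_unspecified()
        || ip.is_multicast())
}

fn is_ban_safe_v6(ip: Ipv6Addr) -> bool {
    let first = ip.segments()[0];
    let ula = (first & 0xfe00) == 0xfc00; // fc00::/7
    let link_local = (first & 0xffc0) == 0xfe80; // fe80::/10
    !(ip.is_loopback() || ip.is_unspecified() || ip.is_multicast() || ula || link_local)
}
```

With this shape, the `10.42.0.1` ingress address from the k3s test is
rejected by the RFC1918 check, so the ban is downgraded before it reaches
the blocklist.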
The [patch.gen0sec] thalamus = { path = "../thalamus" } line was left
uncommented on this branch (amygdala/cortex/dendrite are all commented
per the section's own guidance: workspace deps resolve from the
gen0sec registry; uncomment only for local source-level edits).
It pinned the build to a local ../thalamus checkout that is 1 commit
ahead of thalamus origin/main (ca275a5 'perf(rules): force ContiguousNFA
Aho-Corasick prefilter' — perf-only, unreleased, no API change), so
the branch did not build standalone / in CI. Comment it out; Cargo.lock
re-resolves thalamus 0.0.4 from the gen0sec registry. cargo build
--features proxy,bpf --bin synapse: clean against the registry crate.
…ert set)
With acme.enabled=false the TLS listener was bound to a one-time
startup snapshot of the certificate set: start.rs built
`TlsSettings::with_callbacks(Box::new((*certs).clone()))` once, and
the inotify cert watcher rebuilt `Certificates` into the
`certificates_arc` ArcSwap (start.rs:430) that the live listener's
TlsAccept never consulted. Result: a cert added or rotated in the
certificates dir after boot was invisible until a process restart
(only certs present at startup, and the default, were ever served).
- tls.rs: extract the SNI cert-selection body of
`impl TlsAccept for Certificates::certificate_callback` into an
inherent `Certificates::apply_to_ssl(&self, &mut SslRef)` so a
dynamic resolver can delegate to the CURRENT set; the trait impl
now just calls it. Add `tls_settings_from_accept(Box<dyn
TlsAccept>, grade)` factoring the grade/ALPN wiring so any
resolver can be supplied without duplicating it
(create_tls_settings_with_sni now calls it too — behavior
unchanged).
- start.rs: new `DynamicCertificates { store, fallback }` TlsAccept
that, per handshake, `load_full()`s the live `certificates_arc`
ArcSwap and delegates to that `Certificates::apply_to_ssl`
(falling back to the startup set only if the store is somehow
empty — never after init). The TLS listener is built from it
instead of a snapshot clone. Cheap, lock-free, in-flight
handshakes unaffected, behaviorally identical when the set is
unchanged.
`watch_folder`'s event filter is deliberately left unchanged: it
already re-scans on Create / Modify(Data) / Remove, which is exactly
what an in-place cert write (truncate+write, no rename — the pattern
synapse-operator uses) produces. Atomic tmp+rename would land as
Modify(Name) and still be missed, but no first-party writer does that;
that broadening is intentionally out of scope here.
No new dependencies (arc-swap/async-trait already present in
synapse-proxy). cargo build --features proxy,bpf --bin synapse: clean.
k3s e2e (synapse-operator projecting Ingress/Gateway TLS Secrets into
an operator-owned certs dir): 12/12 — distinct cert served per SNI for
domains added AFTER synapse booted; cert rotation served immediately
with the SAME pod (no restart, no SIGHUP); pruning a cert falls SNI
back to the configured default; default + public domain unaffected.
WAF-branch commit 3f10782 added a pinned-map reconcile (synapse-access-rules: snapshot_kernel_banned_rules) that calls SYNAPSEFirewall::banned_ipv4_entries()/banned_ipv6_entries(). Those inherent accessors exist only on the BPF firewall (firewall/mod.rs); the noop SYNAPSEFirewall (firewall_noop.rs, selected for not(all(unix, feature="bpf")) — Windows / --no-default-features) never got them, so the Windows release-windows --no-default-features build failed: E0599 no method named banned_ipv4_entries. Pre-existing on this branch; unrelated to the thalamus patch hygiene commit (the registry thalamus 0.0.4 compiled fine). Add empty-Vec stubs mirroring the real signatures (no kernel banned_ips map without BPF ⇒ empty reconcile baseline), matching how firewall_noop no-ops every other firewall method. Verified: cargo check -p synapse-access-rules --no-default-features is clean.
The cert-fix import additions in start.rs weren't rustfmt-canonical (import ordering / cfg-attr placement at lines 13/22/28), failing the Formatting job and the Windows job's 'Check formatting' step. nightly rustfmt --edition 2024, this file only; no logic change.
Pre-existing on feature/waf-response-phase (WAF WIP; unrelated to the
thalamus patch-hygiene, the live-cert-reload merge, or the noop
firewall stub). cargo clippy --workspace -- -D warnings and cargo doc
-- -D warnings gate these:
- synapse-eventbridge/src/ids_l7.rs:58 needless_return → tail expr
- synapse-eventbridge/src/ids_l7.rs:76 manual_is_multiple_of
n % 1000 == 0 → n.is_multiple_of(1000)
- synapse-waf/src/actions/penalty_sync.rs:43 public doc intra-link
to private DEFAULT_CHANNEL → plain code span (no link)
Tool-suggested, behaviour-preserving. Verified locally:
clippy -p synapse-eventbridge --all-targets -- -D warnings clean;
RUSTDOCFLAGS=-D warnings cargo doc -p synapse-waf clean; fmt clean.
…ound 2)
Pre-existing WAF-branch lint debt under cargo clippy --workspace
--all-targets -- -D warnings (and the Windows clippy step). Unrelated
to thalamus / live-cert-reload / noop-firewall.
- synapse-eventbridge/src/ids_l7.rs: FRAME_MAX/SINK_CAP are used
only in the #[cfg(unix)] sink path; on non-unix (Windows) they
were dead_code → -D warnings error. Gate both consts #[cfg(unix)].
- synapse-waf/src/actions/penalty_box.rs, synapse-waf/src/
wirefilter.rs, synapse-security/src/runtime_blocklist.rs: new test
modules use .unwrap() (clippy.toml disallowed-methods). Apply the
repo's established #[allow(clippy::disallowed_methods)] convention
(10+ such sites on release/v0.7.0) at the #[cfg(test)] mod level.
Verified locally: cargo clippy --locked --workspace --all-targets --
-D warnings is CLEAN; cargo fmt --all -- --check clean.
Two related fixes from the gcp-nlp-l4 demo memory-reduction pass:
1. BPF map sizes in synapse-security/firewall/bpf:
- xdp_afxdp_tail.bpf.c, xdp_maps.h: tighter per-CPU + per-flow maps;
NO_PREALLOC where shape allows; bounded LRU caps for synapse-side
allowlists/denylists.
- bpf_utils.rs / bpf_utils_noop.rs: matching userspace knobs and the
reference-count bookkeeping that pinned maps need to survive a
pod restart cleanly. Pinned-map staleness across rebuilds was the
bug that initially masked these savings.
2. network.iface: "auto" now selects ONLY physical UP uplink interface(s).
Loopback and CNI/virtual/tunnel devices (veth*, lxc*, cilium*, gke*,
docker*, br-*, vxlan*, …) are excluded; bond* and VLAN sub-interfaces
(eth0.100) are kept. The previous "auto" behaviour attached XDP +
capture to *every* up interface on a node, which exploded memory on
Cilium-equipped GKE nodes (dozens of lxc*/cilium_* veths each get
their own XDP program copy + their own per-CPU JA4 maps) and broke
Cilium's datapath. With the filter, the agent attaches only to the
node uplink, keeping memory at the ~30-50 MB baseline.
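The name-based part of the "auto" filter can be sketched as below. This is
a simplified illustration: the prefix list contains only the patterns
named above (the source elides others), the function name is hypothetical,
and the up/loopback flags are taken as inputs rather than read from the
kernel.

```rust
/// Prefixes named in the commit message; the real list is longer (elided there).
const VIRTUAL_PREFIXES: &[&str] = &["veth", "lxc", "cilium", "gke", "docker", "br-", "vxlan"];

/// Decide whether an interface is a candidate uplink for `network.iface: "auto"`.
/// `is_up` and `is_loopback` are supplied by the caller in this sketch.
pub fn is_uplink_candidate(name: &str, is_up: bool, is_loopback: bool) -> bool {
    if !is_up || is_loopback {
        return false;
    }
    // bond* and VLAN sub-interfaces such as eth0.100 match no prefix and are kept.
    !VIRTUAL_PREFIXES.iter().any(|p| name.starts_with(*p))
}
```

A deny-list of prefixes keeps unknown physical NIC names eligible by
default, at the cost of needing a new entry for each new virtual device
family.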
Cargo.toml: uncomment the dendrite path overrides so the matching
dendrite-side map shrinks reach this build. Drop them again when the
dendrite branch `feat/bpf-map-shrink` is not checked out under
../dendrite.
docs/CONFIGURATION.md + docs/KUBERNETES.md: document the new auto-filter
semantics and the "synapse refuses to clobber a foreign XDP program"
safety. .gitignore: ignore *_token files used during local docker builds.
… UDS bridge
Three independent demo patches that unlock the WAF feature surface
behind GCP Global LB and on the edge-passthrough path. All shipped
and verified end-to-end against gcp-nlp-l4 (test-a harness PASS 37/0,
edge UDS bridge 30 FpObserved/5s observed). Empty defaults preserve
current behaviour; each is opt-in via config.
--- A. effective_client_ip — XFF-aware per-real-client rate-limit & ip.src
`rate_limit::check_rate_limit` keys on the L4 socket peer IP. Behind
any L7 LB that string is the LB IP, so per-IP counters never
accumulate per real client. Patch:
- synapse-utils/src/xff.rs (new) — effective_client_ip(peer, headers,
trusted_proxies) -> IpAddr. Returns first XFF token when peer is in
the trusted CIDR list; else returns peer (spoofing-defence). 9 unit
tests cover empty/untrusted/valid/malformed/IPv6/multi-range/loopback.
- synapse-core/src/core/cli.rs — ProxyConfig.trusted_proxies: Vec<String>
(CIDR strings, default empty).
- synapse-utils/src/structs.rs — AppConfig.trusted_proxies: Vec<(IpAddr,u8)>
parsed once at config load via parse_ip_or_cidr; warns + skips bad
entries.
- synapse-proxy/src/proxyhttp.rs — three call-site swaps: WAF context
build (signal.ip.src), request-phase ratelimit bucket key,
response-phase ratelimit bucket key. invoke_smart_firewall_block
retains socket_addr.ip() — kernel drops must target real L4 peer.
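The trust rule for patch A (use the first XFF token only when the socket
peer is inside a trusted CIDR, otherwise keep the peer) can be sketched as
follows. The signature is simplified and hypothetical: the XFF header
value is passed as an `Option<&str>`, and falling back to the peer on a
malformed token is an assumption.

```rust
use std::net::IpAddr;

/// CIDR containment for same-family addresses; mixed families never match.
fn in_cidr(ip: IpAddr, net: IpAddr, prefix: u8) -> bool {
    match (ip, net) {
        (IpAddr::V4(a), IpAddr::V4(b)) => {
            let p = u32::from(prefix.min(32));
            let mask = if p == 0 { 0 } else { u32::MAX << (32 - p) };
            (u32::from(a) & mask) == (u32::from(b) & mask)
        }
        (IpAddr::V6(a), IpAddr::V6(b)) => {
            let p = u32::from(prefix.min(128));
            let mask = if p == 0 { 0 } else { u128::MAX << (128 - p) };
            (u128::from(a) & mask) == (u128::from(b) & mask)
        }
        _ => false,
    }
}

/// Hypothetical simplification of `effective_client_ip`.
pub fn effective_client_ip_sketch(
    peer: IpAddr,
    xff: Option<&str>,
    trusted: &[(IpAddr, u8)],
) -> IpAddr {
    // An untrusted peer could have forged the header, so ignore it entirely.
    if !trusted.iter().any(|&(net, prefix)| in_cidr(peer, net, prefix)) {
        return peer;
    }
    xff.and_then(|v| v.split(',').next())
        .and_then(|tok| tok.trim().parse::<IpAddr>().ok())
        .unwrap_or(peer) // assumed fallback for a missing or malformed token
}
```

With an empty `trusted` list the function always returns the peer, which
matches the "empty defaults preserve current behaviour" note above.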
--- B. trust_lb_tls_headers — populate signal.ja4 from x-client-tls-ja4
When a TLS-terminating LB injects the JA4 in a request header,
pipe that header into the WAF wirefilter context so the existing
signal.ja4-keyed rules work behind LBs that decrypt upstream.
- synapse-core/src/core/cli.rs — ProxyConfig.trust_lb_tls_headers: bool
(default false). Deployment-shape gate — only flip on when fronted
by a known LB that owns the header; on a direct-attached proxy a
client could spoof the header.
- synapse-waf/src/wirefilter.rs — TRUST_LB_TLS_HEADERS: OnceLock<bool>
+ set_trust_lb_tls_headers() setter. populate_request_context now
reads `x-client-tls-ja4` into signal.ja4 when the flag is on,
falling back to empty string. 43 existing WAF tests still pass.
- synapse-app/src/lib.rs — startup wire-up calls the setter from
config.proxy.trust_lb_tls_headers.
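The write-once flag plus header read in patch B could look roughly like this. It is a sketch: `ja4_from_lb_header` is a hypothetical name, and returning `None` when the flag is off (so the caller keeps whatever signal.ja4 it already had) is an assumption about the surrounding code.

```rust
use std::sync::OnceLock;

static TRUST_LB_TLS_HEADERS: OnceLock<bool> = OnceLock::new();

/// Called once at startup. OnceLock is write-once, so later calls are ignored.
pub fn set_trust_lb_tls_headers(enabled: bool) {
    let _ = TRUST_LB_TLS_HEADERS.set(enabled);
}

/// `header` is the value of `x-client-tls-ja4`, if present.
/// Returns None when the flag is off or was never set (default false).
pub fn ja4_from_lb_header(header: Option<&str>) -> Option<String> {
    let trusted = TRUST_LB_TLS_HEADERS.get().copied().unwrap_or(false);
    if !trusted {
        return None;
    }
    // Flag on: use the header, falling back to an empty string.
    Some(header.unwrap_or("").to_string())
}
```

Keeping the flag in a `OnceLock` means the request path only does an atomic read, and a direct-attached proxy that never calls the setter ignores the header entirely.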
--- C. fp_uds_subscriber — cross-process FpObserved bridge over UDS
Lets a co-located synapse-agent (XDP, captures JA4 etc.) ship its
observations to a TLS-terminating synapse-proxy that has no XDP of
its own. Most plumbing already existed in synapse-eventbridge —
send_fp_event() had a FP_SOCKET_CLIENTS broadcast path but no
listener was registering with it. Two-line publisher fix + a new
subscriber module:
- synapse-eventbridge/src/event_server.rs — set_fp_socket_clients()
call so the existing event server fans out FpObserved alongside
HTTP and Packet.
- synapse-eventbridge/src/fp_uds_subscriber.rs (new) — proxy-side
client: connect to UDS, read JSON-line SocketEvent::Fp envelopes,
republish onto the local bus via send_fp_event. Reconnects on
EPIPE / connect failure with 2 s backoff. Unix-only (Windows stub).
2 unit tests: end-to-end pump + missing-socket retry safety.
- synapse-eventbridge/src/lib.rs — pub mod fp_uds_subscriber + a
pub use synapse_events::{FpObserved, FpSource} re-export.
- synapse-eventbridge/Cargo.toml — tempfile dev-dep for the test.
- synapse-core/src/core/cli.rs — ProxyConfig.fp_event_bridge: { socket_path:
String } (empty = subscriber disabled, default).
- synapse-app/src/lib.rs — startup spawns the subscriber thread when
the socket_path is non-empty AND is_agent_mode is false, with a
shutdown atomic tied to the main shutdown_rx watch.
- synapse-security/src/utils/fingerprint/tcp_fingerprint.rs — the BPF
ClientHello-store path previously emitted nothing on the FpObserved
bus (only synapse-app::kernel_pump did, and only for PacketEvent).
Surfaced during edge-cluster deployment as a silent bridge. Added
the missing send_fp_event() call right after the existing
"JA4: stored eBPF-captured ClientHello fingerprint" log line so
BPF-sourced captures are visible to cross-process subscribers.
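The proxy-side read loop in patch C can be sketched as a line pump wrapped in a reconnect loop. This is an illustration under assumptions: `pump_lines` and the `publish` callback are hypothetical stand-ins for the JSON decode and `send_fp_event` steps, and the envelope is treated as an opaque line rather than parsed.

```rust
use std::io::BufRead;

/// Hands each non-empty line (one JSON envelope per line) to `publish`.
/// Returns the number of lines forwarded; stops at EOF or the first read error.
pub fn pump_lines<R: BufRead>(reader: R, mut publish: impl FnMut(&str)) -> usize {
    let mut forwarded = 0;
    for line in reader.lines() {
        let Ok(line) = line else { break };
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        publish(line);
        forwarded += 1;
    }
    forwarded
}

/// Reconnect loop with the fixed 2 s backoff mentioned above (Unix only).
#[cfg(unix)]
pub fn run_loop(
    socket_path: &str,
    shutdown: &std::sync::atomic::AtomicBool,
    mut publish: impl FnMut(&str),
) {
    use std::io::BufReader;
    use std::os::unix::net::UnixStream;
    use std::sync::atomic::Ordering;
    use std::time::Duration;

    while !shutdown.load(Ordering::Relaxed) {
        if let Ok(stream) = UnixStream::connect(socket_path) {
            pump_lines(BufReader::new(stream), &mut publish);
        }
        // Either the connect failed or the stream ended: wait, then retry.
        std::thread::sleep(Duration::from_secs(2));
    }
}
```

In this sketch the shutdown flag is only checked between connections, so a blocked read delays shutdown until the publisher closes the stream.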
…pin reuse
When BPF map shapes change between releases (e.g. the
AFXDP_FLOW_DENY_MAX shrink from 1048576 → 65536 in commit 7334be2),
libbpf reuse-by-name fails OpenSkel::load() with EINVAL because kernel
6.12+ strictly validates pinned-map properties (max_entries, map_flags,
value_size) on reuse. The agent then silently falls back to the
XDP/BPF-only backend, AFXDP_FW_HANDLE stays None, and
`synapse_smart_firewall::ja4_reload::apply_ja4_kernel_blocks` no-ops, so
kind: ja4 entries in smart_firewall_rules.block never reach the kernel.
Surgical EINVAL recovery on `.load()`:
- synapse-security/src/utils/bpf_utils.rs (+38 LOC):
`unpin_stale_pinned_maps(pin_root, names)` does a best-effort
remove_file for each name under pin_root; silent on NotFound, WARN on
other I/O errors. Surgical name list (not remove_dir_all) is required
because pin_root may host other pins that the same process owns and
reuses (e.g. xdp_link_<iface>).
- synapse-app/src/lib.rs (refactor load_afxdp_tail_program): factor the
builder→open→load triple into a closure so the retry can re-leak a
fresh MaybeUninit<OpenObject> (libbpf consumes it per attempt). On
EINVAL (string-match os error 22 or Invalid argument; libbpf-rs
surfaces no structured kind), invoke the helper with TAIL_PIN_NAMES =
[xsks_map, flow_deny, afxdp_intercept_stats] (the three
LIBBPF_PIN_BY_NAME maps that xdp_afxdp_tail.bpf.c declares) and retry
once. Any non-EINVAL error or a second EINVAL bubbles to the caller.
Mirrors the B.4 retry shape in xdp_pipeline (lib.rs:1226-1260) but
diverges in cleanup mechanism: B.4 uses remove_dir_all, which is safe
THERE because B.4 runs before xdp_link_<iface> is pinned; the tail
loader runs AFTER, so a directory-wide wipe would orphan the live XDP
attachment. Unifying both call sites is a deferred follow-up.
Verified live on the gcp-nlp-l4 edge cluster: pre-fix, the agent logged
`afxdp tail program load failed: ... (os error 22)`; after manually
removing the stale /sys/fs/bpf/synapse/firewall/flow_deny pin AND
enabling firewall.amygdala.enabled=true in config-agent.yaml, agent logs
`amygdala_bridge started (afxdp backend, iface=eth0)`, `ja4 kernel
blocks reloaded: +6 / -0 (total now 6)`. With this commit, future
map-shape changes self-heal without operator `rm` intervention.
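The unpin-then-retry-once shape can be sketched with the standard library alone. Everything here is illustrative: the loader is modelled as a closure returning `Result<T, String>`, `looks_like_einval` and `load_with_unpin_retry` are hypothetical names, and the `usize` return value of the unpin helper is an assumption.

```rust
use std::fs;
use std::io::ErrorKind;
use std::path::Path;

/// Best-effort removal of the named pins under `pin_root`.
/// Returns how many files were actually removed.
pub fn unpin_stale_pinned_maps(pin_root: &Path, names: &[&str]) -> usize {
    let mut removed = 0;
    for name in names {
        match fs::remove_file(pin_root.join(name)) {
            Ok(()) => removed += 1,
            Err(e) if e.kind() == ErrorKind::NotFound => {} // silent: nothing to clean
            Err(e) => eprintln!("WARN: could not unpin {name}: {e}"),
        }
    }
    removed
}

/// String match, because the error carries no structured kind in this model.
fn looks_like_einval(msg: &str) -> bool {
    msg.contains("os error 22") || msg.contains("Invalid argument")
}

/// Runs `load`; on an EINVAL-looking error, unpins `names` and retries exactly once.
/// Any other error, or a second failure, is returned to the caller.
pub fn load_with_unpin_retry<T>(
    pin_root: &Path,
    names: &[&str],
    mut load: impl FnMut() -> Result<T, String>,
) -> Result<T, String> {
    match load() {
        Err(e) if looks_like_einval(&e) => {
            unpin_stale_pinned_maps(pin_root, names);
            load()
        }
        other => other,
    }
}
```

Because only the listed names are removed, an unrelated pin such as `xdp_link_eth0` in the same directory survives the recovery.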
…les loaded
BPF tail program (xdp_afxdp_tail.bpf.c) used to unconditionally
redirect every TLS ClientHello to the userspace amygdala worker for JA4
inspection. When no `kind: ja4` rules are loaded, BlockedJa4 is empty,
so the worker has no DROP verdict to issue; its default "allow" path
bounces the frame back out via AF_XDP TX. On a host-firewall topology
(the packet's dst IP is THIS host), that ships the frame back to the
upstream gateway, which black-holes it. Net result: every external TLS
handshake silently times out whenever the AF_XDP backend is running but
the rule set is empty.
Verified live on the gcp-nlp-l4 edge cluster (a.g0s.dev): curl from
external Linux clients to :443 timed out for hours, while in-cluster
TLS to the same node:443 succeeded; scaling synapse-agent to 0 replicas
immediately restored end-to-end connectivity. The agent BPF stack was
the sole source of the drop, despite UC4/UC7 rule maps being empty.
Fix: a runtime kill-switch in the BPF tail.
- New `afxdp_intercept_cfg` ARRAY map (1 entry, pinned-by-name). Key 0
holds a u32 `enabled` flag.
- Tail program reads the flag AFTER the flow_deny LRU check (so
residual blocked flows still drop during rule removal) and BEFORE the
AF_XDP redirect. If the lookup misses OR returns 0, return XDP_PASS, so
packets flow into the local kernel stack unmolested.
- Default state: map empty → BPF lookup returns NULL → treated as
enabled=0 → pass. Fresh agent boot is fail-safe.
Userspace plumbing (3 files):
- synapse-utils/src/hooks.rs (+35 LOC): new hook pair
`set_afxdp_intercept_enabled_hook` +
`invoke_afxdp_intercept_enabled(bool)` mirroring the existing
ja4_kernel_block_reload hook pattern.
- synapse-smart-firewall/src/ja4_reload.rs: after each BlockedJa4 diff
applies, calls `invoke_afxdp_intercept_enabled(!wanted.is_empty())` so
the flag tracks rule-set non-emptiness on every config reload. Plus +1
dep in Cargo.toml (synapse-utils).
- synapse-app/src/lib.rs (+37 LOC): registers the hook handler. Opens
`/sys/fs/bpf/synapse/firewall/afxdp_intercept_cfg` via
libbpf_rs::MapHandle::from_pinned_path and updates key 0. Quietly skips
when the pin isn't available yet (early startup, before the AF_XDP tail
program has loaded).
Fail-safe properties:
- Cold boot (no rules) → traffic flows.
- Tail program not loaded → hook no-ops cleanly.
- AF_XDP backend not built → no hook registered, no-op.
- Rule removal mid-flight → flow_deny check still drops residual
blocked flows (it runs BEFORE the kill-switch).
- Stale enabled=1 after worker death → bpf_redirect_map falls back to
its third-arg XDP_PASS when no AF_XDP socket is bound, so the redirect
is safe in the absence of a userspace consumer.
Follow-up: same bug exists in amygdala's standalone-attach variant
(amygdala/src/bpf/afxdp_intercept/intercept.bpf.c, used when
afxdp_attach_mode != "shared"). Will land as a parallel commit in
amygdala.
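The ordering described above (flow_deny first, kill-switch second, redirect last) can be modelled in plain Rust. This is a userspace illustration only, not the BPF program: `Verdict`, `tail_verdict` and `desired_flag` are hypothetical names, and the model assumes the packet has already been identified as a TLS ClientHello.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Drop,
    Pass,
    RedirectToUserspace,
}

/// `flow_denied`: result of the flow_deny lookup for this flow.
/// `intercept_enabled`: result of looking up key 0 in the kill-switch map
/// (None models a missed lookup).
pub fn tail_verdict(flow_denied: bool, intercept_enabled: Option<u32>) -> Verdict {
    // 1. Residual blocked flows drop regardless of the kill-switch.
    if flow_denied {
        return Verdict::Drop;
    }
    // 2. A missing or zero flag means "do not intercept": pass to the kernel stack.
    match intercept_enabled {
        Some(flag) if flag != 0 => Verdict::RedirectToUserspace,
        _ => Verdict::Pass,
    }
}

/// Userspace side: the flag tracks whether any JA4 block rules are loaded.
pub fn desired_flag(blocked_ja4_count: usize) -> u32 {
    u32::from(blocked_ja4_count > 0)
}
```

The fail-safe cases in the list map directly onto this function: a cold boot is `tail_verdict(false, None)`, which passes, and a rule removal in flight is `tail_verdict(true, Some(0))`, which still drops.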
After agent restart, the pinned `afxdp_intercept_cfg` BPF map
survives but the in-memory `previous` HashSet in apply_ja4_kernel_blocks
is fresh-empty. If the new YAML has no `kind: ja4` entries, the diff
comes out empty (wanted={} = previous={}) and the function returns
early WITHOUT updating the kill-switch flag — leaving a stale
enabled=1 from the previous generation. The BPF tail then redirects
ClientHellos to a worker whose BlockedJa4 set is also fresh-empty,
so every TLS handshake stalls.
Resync the kill-switch unconditionally before the early-return. The
map update is a single 4-byte syscall and idempotent — cheap to
always run.
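A small model of the fix, assuming the reload function keeps a `previous` set and writes the flag through some side effect. `Ja4Reloader` and its `flag_writes` log are hypothetical, introduced only so the ordering is observable.

```rust
use std::collections::HashSet;

pub struct Ja4Reloader {
    previous: HashSet<String>,
    /// Every value written to the kill-switch, in order (stands in for the map update).
    pub flag_writes: Vec<bool>,
}

impl Ja4Reloader {
    pub fn new() -> Self {
        Ja4Reloader { previous: HashSet::new(), flag_writes: Vec::new() }
    }

    /// Returns (added, removed) relative to the previous rule set.
    pub fn apply(&mut self, wanted: HashSet<String>) -> (usize, usize) {
        // Resync the kill-switch BEFORE the early return, so a stale
        // enabled=1 left in a pinned map is overwritten even on an empty diff.
        self.flag_writes.push(!wanted.is_empty());

        if wanted == self.previous {
            return (0, 0);
        }
        let added = wanted.difference(&self.previous).count();
        let removed = self.previous.difference(&wanted).count();
        self.previous = wanted;
        (added, removed)
    }
}
```

On a fresh process with no `kind: ja4` rules, the first `apply` reports an empty diff but still records a `false` write, which is the behaviour the commit restores.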
…d-proxy topologies
When a synapse-proxy sits behind another synapse-proxy with a GFE/L7 LB
in between (client → edge synapse → Tier-2 LB → Tier-2 synapse), the
Tier-2 LB injects X-Client-Tls-Ja4 carrying the EDGE synapse's JA4 (it
terminated TLS with the LB), not the original client's. Patch B as
shipped read X-Client-Tls-Ja4 unconditionally, so signal.ja4 at Tier-2
came out as the edge's JA4, which is worthless for client
identification.
Edge synapse-proxy already forwards the real client's JA4 via X-JA4 /
X-JA4-Raw headers when forward_fingerprints=true (set in
proxyhttp.rs:1733). Prefer those upstream-synapse-forwarded headers
over the LB-injected one. Both still gated by the trust_lb_tls_headers
flag so directly-exposed proxies can't be spoofed.
Also populate signal.ja4_raw_unsorted from X-JA4-Raw, which was always
empty before.
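The header precedence can be sketched as a single selection function. The name `select_ja4`, the treatment of an empty X-JA4 as absent, and returning `None` when the flag is off are assumptions made for illustration.

```rust
/// Picks the JA4 to expose as signal.ja4 from forwarded headers.
/// `x_ja4` is the value forwarded by an upstream synapse-proxy,
/// `x_client_tls_ja4` is the value injected by the LB.
pub fn select_ja4<'a>(
    trust_lb_tls_headers: bool,
    x_ja4: Option<&'a str>,
    x_client_tls_ja4: Option<&'a str>,
) -> Option<&'a str> {
    if !trust_lb_tls_headers {
        // Directly exposed proxy: neither header is honoured.
        return None;
    }
    // Prefer the upstream-synapse header; fall back to the LB-injected one.
    x_ja4.filter(|v| !v.is_empty()).or(x_client_tls_ja4)
}
```

At Tier-2, a request carrying both headers resolves to the X-JA4 value, which is the original client's fingerprint rather than the edge proxy's.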
The captcha-client init block (init_captcha_client +
start_cache_cleanup_task) lived inside the api_configured
branch. LOCAL MODE deployments — proxy.captcha.{site_key,
secret_key,jwt_secret} populated from a YAML file with no
API key — therefore had Turnstile/hCaptcha keys loaded but
the client never wired up: startup_complete logged
`captcha_client: false`, and any WAF rule with action:
captcha failed at request time.
Move the init below the if/else so it runs for both modes.
WafAction::Challenge was passing peer_addr.ip() to
validate_captcha_token + generate_captcha_token. Behind a GFE LB (or
any front-end that rotates peers) the L4 peer changes per request, so
the JWT issued on attempt 1 fails the ip_address check on attempt 2:
captcha loops forever even when Turnstile solving succeeds and the
token is marked validated in Redis.
Use synapse_utils::xff::effective_client_ip against
self.config.trusted_proxies (same pattern Patch A uses for
signal.ip.src). The XFF first-hop is stable across GFE/NLB rebalancing
while still spoof-resistant for direct clients.
Three layered hardenings on the captcha bearer cookie:
1. JA4 binding. `CaptchaClaims.ja4_fingerprint` (already part of the
struct, previously always None) is now populated at issue time from
`_ctx.tls_fingerprint.ja4` with `X-Client-Tls-Ja4` fallback for
LB-terminated TLS. `validate_token` compares the claim against the
request-time JA4 and rejects on mismatch. Defeats
replay-from-same-NAT-with-spoofed-UA attackers who can mimic IP+UA but
not the victim TLS stack signature.
2. Cookie Max-Age now tracks the JWT exp via the new
`captcha_token_ttl_seconds()` helper, instead of hardcoding 3600. Drops
the default TTL from 2h (7200) to 10min (600) in the demo config, which
shrinks the replay window without breaking typical sessions.
3. Opt-in one-shot mode via `proxy.captcha.one_shot: true`. After a
successful cookie-bypass, `revoke_captcha_token` is called on the JTI,
blacklisting it in Redis. The next request through the same
`challenge`-action rule re-challenges. For sensitive paths only; off by
default so normal session UX is unchanged.
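A rough sketch of the first two hardenings. The struct is reduced to two fields, `ja4_matches` and `cookie_max_age` are hypothetical helpers, and two behaviours are assumptions rather than facts from the commit: tokens without a bound JA4 are accepted, and Max-Age is derived as `exp - now`.

```rust
pub struct CaptchaClaims {
    /// Expiry as seconds since the Unix epoch.
    pub exp: u64,
    /// JA4 recorded at issue time, if any.
    pub ja4_fingerprint: Option<String>,
}

/// Rejects when the token is bound to a JA4 and the request presents a different one.
pub fn ja4_matches(claims: &CaptchaClaims, request_ja4: Option<&str>) -> bool {
    match (&claims.ja4_fingerprint, request_ja4) {
        (Some(bound), Some(seen)) => bound.as_str() == seen,
        (Some(_), None) => false, // bound token, but no fingerprint on this request
        (None, _) => true,        // assumption: unbound tokens skip the check
    }
}

/// Cookie lifetime that never outlives the JWT.
pub fn cookie_max_age(exp: u64, now: u64) -> u64 {
    exp.saturating_sub(now)
}
```

With a 600 s token issued at `now = 400`, `cookie_max_age(1000, 400)` yields 600, so the cookie and the JWT expire together.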
Global proxy.captcha.one_shot is a single switch; ops need finer
control to keep normal session UX on most challenge-action rules while
making specific sensitive paths (password reset, payment confirm)
revoke-after-first-bypass.
WafResult grows a `one_shot: bool` field, parsed from the rule's
`config.oneShot` JSON at compile time (only honored when action is
Challenge; ignored elsewhere). proxyhttp.rs combines per-rule + global
flags via OR so either knob can turn it on.
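The two-knob combination is small enough to show directly. The enum variants other than `Challenge` and both function names are illustrative assumptions; `config_one_shot` stands in for the already-parsed `config.oneShot` value.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WafAction {
    Allow,
    Block,
    Challenge,
}

/// Per-rule flag: only honoured when the rule's action is Challenge.
pub fn rule_one_shot(action: WafAction, config_one_shot: Option<bool>) -> bool {
    action == WafAction::Challenge && config_one_shot.unwrap_or(false)
}

/// Either the global switch or the per-rule flag turns one-shot on.
pub fn effective_one_shot(global: bool, per_rule: bool) -> bool {
    global || per_rule
}
```

A password-reset rule with `oneShot: true` revokes after the first bypass even when the global switch is off, while a Block rule carrying the same field is unaffected.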
…imit field
Per-rule Challenge configs without a rateLimit field (e.g. just
{ "oneShot": true }) used to spam ERROR-level "rateLimit field not
found" logs at compile time. The rule still worked but the log
was noise. Pre-filter for the field before invoking from_json.
Mechanical cleanup so this branch is wellness-check-clean before
opening the PR:
* cargo fmt --all — reformats 8 files touched earlier in the
session (xff helpers, fp_uds_subscriber, AF_XDP killswitch,
captcha JA4 binding, etc.). Whitespace + trailing-comma
canonicalisation only; no behaviour change.
* Replace two bare `.unwrap()` calls in test code that the
repo bans via `clippy::disallowed_methods`:
- synapse-eventbridge/src/fp_uds_subscriber.rs (test
publisher) — `.unwrap()` on a hard-coded `1.2.3.4`
IpAddr parse → `.expect("hard-coded literal IPv4
parses")`.
- synapse-utils/src/xff.rs (test fixture builder) —
`.unwrap()` on `HeaderValue::from_str` of a test XFF
literal → multi-line `.expect("test XFF literal must
be valid header value")`.
* Add the missing `one_shot: false` field to five `WafResult
{ … }` struct literals in
synapse-worker/src/access_log.rs — a fall-out from the
earlier `feat(waf): per-rule one_shot opt-in via
rule.config.oneShot` commit which extended the struct shape
but missed these five test-only construction sites.
Verified locally on Linux with CI-exact flags:
cargo fmt -- --check OK
cargo clippy --locked --workspace --all-targets -D warnings OK
cargo test --locked --workspace --lib OK
(33 tests in synapse-worker — including the 5 access_log
sites — all green)
Windows leg is covered by the matrix job (.github/workflows/
windows-build.yaml) once the branch is pushed.
…ling checkout
The `[patch.gen0sec]` block had four `dendrite{,-core,-linux,-windows}
= { path = "../dendrite/crates/<name>" }` entries left uncommented from
local source-edit work. CI runners check out only the synapse repo,
not the dendrite sibling, so cargo fails on PR #324 with:
error: failed to load source for dependency `dendrite`
Caused by: unable to update D:\a\synapse\dendrite\crates\dendrite
Caused by: failed to read `…\dendrite\crates\dendrite\Cargo.toml`
Caused by: The system cannot find the path specified.
(visible on the Windows MSI build job — step 7 `Build static exe
(no-default-features)`, run 26232111796). Same failure mode would
hit the Linux build/test jobs once they reached the cargo invocation.
Comment all four lines back out so the registry-version dendrite
0.1.2 resolves cleanly. Cargo.lock is refreshed at the same time to
drop any path-source references and pin firmly to the registry
revision. Local dev that needs the source-level dendrite edits
uncomments these lines temporarily — the comment block now spells
out the gotcha explicitly so future runs do not repeat it.
Verified locally with CI-exact flags after the change:
cargo check --workspace OK
cargo clippy --locked --workspace --all-targets -D warnings OK
…ilds
The HashMap is populated + drained inside `#[cfg(all(unix, feature
= "bpf"))]` blocks. On the Linux runner those blocks compile so the
variable is live. On the Windows MSI / windows-build.yaml job the
cfg gate elides every reader, leaving the declaration unused →
`-D warnings` trips the windows-build clippy step at lib.rs:918.
Match the established pattern a few lines further down where
`bpf_startup_ok` (same shape: declared unconditionally, read only
under the cfg gate) carries `#[allow(unused_mut, unused_variables)]`.
Cheaper than wrapping the declaration in another cfg block — the
binding is needed when the cfg is on, and the allow keeps the diff
narrow.
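The pattern looks roughly like this. The function and variable names are made up for the sketch; only the `#[allow(unused_mut, unused_variables)]` attribute and the `cfg(all(unix, feature = "bpf"))` gate come from the commit.

```rust
use std::collections::HashMap;

/// Counts interfaces recorded during startup. On builds where the cfg gate
/// below is off, nothing touches the map and the count stays at zero.
pub fn attached_iface_count() -> usize {
    // Declared unconditionally; the allow keeps `-D warnings` quiet on
    // targets where every reader and writer is compiled out.
    #[allow(unused_mut, unused_variables)]
    let mut per_iface: HashMap<String, u32> = HashMap::new();

    #[allow(unused_mut)]
    let mut attached = 0usize;

    #[cfg(all(unix, feature = "bpf"))]
    {
        per_iface.insert("eth0".to_string(), 1);
        attached = per_iface.len();
    }

    attached
}
```

Wrapping the `let` itself in the same cfg would also work, but then every later mention of the binding needs a gate too, which is the wider diff the commit avoids.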
Verified locally (Linux):
cargo fmt -- --check OK
cargo clippy --locked --workspace --all-targets -D warnings OK
Windows fix verified at next CI run.
The multi_variant_subscriber_writes_distinct_event_types test spawns
a tokio task that holds a `BufWriter` against a real on-disk file
and depends on the `tokio::time::interval(100ms)` flush tick firing
before the test reads the file back. Under `cargo miri test
--no-default-features` (the wellness-check.yaml miri job, run on
synapse-core / -events / -eventbridge / -log) the runtime/timer
combination + Miri's lack of faithful FS modelling means the
BufWriter is never flushed before the assertion runs, so the test
sees `lines == []` and fails:
thread '…multi_variant_subscriber_writes_distinct_event_types'
panicked at crates/synapse-core/src/logger/eventbridge_log.rs:185:9:
expected a fingerprint line, got: []
Apply the same `#[cfg_attr(miri, ignore)]` gate the repo already
uses in 8 sites (`utils::http_client` for FFI-touching tests,
`core::cli` for serial_test-using tests). The test still runs as
part of the regular `cargo test --workspace --lib` job (verified
locally: passes in 0.25s).
The unit-test coverage of the broadcast → file pipeline is not lost
— `cargo test` covers the real path. Miri's value is on the pure
in-memory logic in this same module set (e.g. WAF rule plumbing,
JA4 parsers), which still runs.
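For reference, the gate is a one-line attribute on the test. The helper and test below are illustrative and unrelated to the real eventbridge_log test; they only show where `#[cfg_attr(miri, ignore)]` sits.

```rust
use std::io::Write;

/// Writes one line through a BufWriter, flushes, then reads the file back.
pub fn write_and_read_back(path: &std::path::Path, line: &str) -> std::io::Result<Vec<String>> {
    {
        let mut w = std::io::BufWriter::new(std::fs::File::create(path)?);
        writeln!(w, "{line}")?;
        w.flush()?; // without an explicit flush the data may still sit in the buffer
    }
    Ok(std::fs::read_to_string(path)?.lines().map(str::to_owned).collect())
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    #[cfg_attr(miri, ignore)] // touches the real filesystem; skipped under `cargo miri test`
    fn flushes_before_read_back() {
        let path = std::env::temp_dir().join("miri_gate_sketch_test.log");
        let lines = write_and_read_back(&path, "fingerprint").expect("temp file is writable");
        assert_eq!(lines, vec!["fingerprint".to_string()]);
        let _ = std::fs::remove_file(&path);
    }
}
```

Under a normal `cargo test` the attribute is inert, so the test keeps running in the regular unit-test job.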
…under miri
Second wave of the same class of Miri incompatibility — the first
thing `start_fp_uds_subscriber`'s background thread does is
`UnixStream::connect(&socket_path)`, which calls
`libc::socket(AF_UNIX, SOCK_STREAM, 0)`. Miri models only AF_INET
and AF_INET6 sockets and aborts:
error: unsupported operation: socket: domain 0x1 is unsupported,
only AF_INET and AF_INET6 are allowed.
at fp_uds_subscriber.rs:88 (run_loop)
spawned from fp_uds_subscriber.rs:67 (start_fp_uds_subscriber)
thread name: fp-uds-sub
Apply the same `#[cfg_attr(miri, ignore)]` gate, matching the
convention now spanning four files in this PR (eventbridge_log,
utils::http_client, core::cli, and now fp_uds_subscriber).
The UDS bridge still has coverage via:
* `cargo test --workspace --lib` (wellness-check unit-tests job)
* The e2e-root job (real socket end-to-end with `socat`)
* Live verification in the gcp-nlp-l4 demo cluster
(30 envelopes/5 s observed against the agent socket)
The remaining Miri scope still exercises the pure-Rust eventbridge
broadcast plumbing + the JSON decode path in `run_loop`'s body
when called directly without the AF_UNIX dependency.
Picks up
`fix(afxdp): kill-switch for standalone intercept program — parity with synapse tail`
(gen0sec/amygdala#5, published as gen0sec/amygdala@v0.1.1 on
2026-05-21).
Pairs with the synapse-side AF_XDP kill-switch commits already on this
branch (2b3d329, a94ae0f, 7bff008): synapse owns the `xdp_afxdp_tail`
program, amygdala owns the standalone `AttachedIntercept` program, both
now `XDP_PASS` when no JA4 rules are loaded so the kernel intercept
does not stall traffic when the userspace consumer has nothing to
drain.
Touched files:
* crates/synapse-idp/Cargo.toml (Windows / NDIS path)
* crates/synapse-smart-firewall/Cargo.toml (Linux + Windows)
* crates/synapse-security/Cargo.toml (Reactor + JA4)
* Cargo.toml (workspace-deps comment)
* Cargo.lock (registry pin + checksum)
Verified locally with CI-exact flags:
cargo fmt -- --check OK
cargo clippy --locked --workspace --all-targets -D warnings OK
cargo test --locked --workspace --lib OK
This pub fn had zero callers anywhere in the workspace. The L3/L4
enforcement path (apply_rules / apply_rules_nftables /
apply_rules_iptables) only reads rule.block.{ips,country,asn} and
never consults rule.allow.*, so the helper was unreachable from
the kernel ban path.
Note: access_rules.allow remains a config schema field used by
the TUI event bus (synapse-app, terminal_client) for display
transport only. No L3/L4 allow-precedence semantics are implied.
Operators who need allow-precedence should rely on proxy-mode
WAF rules instead.
…es + iptables)
Before this change the agent topology silently accepted
access_rules.allow.* entries in config but never enforced them. The
XDP program, nftables chain, and iptables chain only consulted
access_rules.block.*; an IP in only the allow list got no protection
from a runtime ban, and an IP in both allow and block was blocked.
Operators expecting allow-precedence (e.g. "always let our corp VPN
through even if threat-intel flags an IP in that range") had no way
to express it in agent mode — only the proxy / WAF topology had a
working allow path (wirefilter L7 rules).
This wires allow as a true override at every kernel-side layer:
XDP path
- New / LPM-trie maps in xdp_maps.h.
Same pinning + size as the existing banned_ips maps so they
survive a process restart and grow lazily.
- now checks allowed_ips BEFORE banned_ips: a hit
short-circuits to XDP_PASS, defeating any banned-ip entry from
config or from amygdala / WAF runtime ban.
- SYNAPSEFirewall::{allow_ip,unallow_ip,allow_ipv6,unallow_ipv6}
push deltas to the LPM maps; mirror
for the pinned-map reconcile on
startup.
nftables path
- sets created in the synapse table, with
rules inserted before the existing drop rules in
. nft evaluates rules in order so accept wins.
iptables path
- Allow rules go in as at position 1 of
SYNAPSE_BLOCK so they run before any DROP rule and the packet
falls back to INPUT for the rest of the host policy.
apply_rules (every backend) now parses access_rules.allow.{ips,
country,asn} alongside block, diffs against per-process previous
state, and pushes the deltas via the new Firewall-trait methods.
Allow state is reconciled against the live kernel map once per
process (same crash-survivability pattern as block).
Userspace predicate updated:
- is restored (the dead-function
removal from the previous commit is reverted — it stops being
dead the moment we wire it).
- checks allow FIRST, so a
runtime ban (DashSet) on an allow-listed IP also reports
not-blocked.
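The allow-first precedence of the userspace predicate can be sketched as follows. `AccessRules`, its fields and `is_blocked` are hypothetical names for this illustration, and country and ASN matching are left out to keep it short.

```rust
use std::collections::HashSet;
use std::net::IpAddr;

/// Address bits widened to u128 plus the family's bit width.
fn key(ip: IpAddr) -> (u128, u32) {
    match ip {
        IpAddr::V4(v4) => (u128::from(u32::from(v4)), 32),
        IpAddr::V6(v6) => (u128::from(v6), 128),
    }
}

fn in_cidr(ip: IpAddr, net: IpAddr, prefix: u8) -> bool {
    let ((a, wa), (b, wb)) = (key(ip), key(net));
    if wa != wb {
        return false; // never match across address families
    }
    let shift = wa - u32::from(prefix).min(wa);
    // checked_shr returns None for a full 128-bit shift (IPv6 /0): treat as zero.
    a.checked_shr(shift).unwrap_or(0) == b.checked_shr(shift).unwrap_or(0)
}

pub struct AccessRules {
    pub allow: Vec<(IpAddr, u8)>,
    pub block: Vec<(IpAddr, u8)>,
    pub runtime_bans: HashSet<IpAddr>,
}

impl AccessRules {
    pub fn is_blocked(&self, ip: IpAddr) -> bool {
        let hit = |list: &[(IpAddr, u8)]| list.iter().any(|&(net, p)| in_cidr(ip, net, p));
        // Allow is checked FIRST, so it overrides both config blocks and runtime bans.
        if hit(&self.allow) {
            return false;
        }
        hit(&self.block) || self.runtime_bans.contains(&ip)
    }
}
```

With `10.8.0.0/16` allowed and `10.0.0.0/8` blocked, `10.8.1.1` reports not-blocked while `10.9.1.1` is still blocked, mirroring the corp-VPN example above.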
Tests (synapse-access-rules, 8 new):
- direct IPv4 match, CIDR match, country list, ASN list
- allow overrides block at same IP
- allow beats runtime blocklist
- block-only IP is still blocked
- IPv6 CIDR allow overrides IPv6 block
Serialized with serial_test because they mutate the
process-global Config singleton.
Wellness check (matches .github/workflows/wellness-check.yaml) all
green locally: fmt, machete, clippy --workspace --all-targets
(default + classifier feature), cargo doc with -D warnings, and
cargo test --workspace --lib.
One-shot example binary that wires Config -> Logs SDK -> OTLP/HTTP
the same way Synapse does in production and emits five synthetic
BlockEvents covering every (layer, action) pair the block-counter
buckets on. Used to validate the full Synapse -> telemetry-api ->
kafka -> workflow chain locally.
Run from the repo root:
cargo run --example e2e_emit \
--manifest-path crates/synapse-telemetry/Cargo.toml -- \
<telemetry-api-url> <api-key>
The example uses block_on + tokio::time::sleep so the
BatchLogProcessor's flush task actually runs before the runtime
tears down (std::thread::sleep on a current-thread runtime would
block the executor and silently drop the batch).
Until now the OTLP Logs and Metrics exporters reported themselves to
the platform as 'OTel OTLP Exporter Rust/<otel-version>' — the
opentelemetry-otlp default. The worker's config-poll client (built
via synapse_core::utils::http_client) already sends
'Synapse/<CARGO_PKG_VERSION>'. Two outbound clients, two agent
identities — a paper cut in access-log correlation and an awkward
extra entry for any allowlist the platform side maintains.
Unified: a new crate::otlp::apply_synapse_headers helper sets
User-Agent to synapse_user_agent() ('Synapse/<version>') on every
OTLP exporter builder before threading the caller's headers
(typically Authorization) through. logs_sdk::init and metrics::init
both go through it. A caller-supplied User-Agent in OtlpConfig.headers
still overrides — useful when a tenant routes through a proxy that
requires a specific string.
Workaround for an upstream bug: opentelemetry-otlp 0.27's
WithHttpConfig::with_headers uses Option::iter_mut().zip(headers)
where the Option has at most 1 element. The zip yields exactly one
pair regardless of how many entries the caller supplies, so every
header after the first is silently dropped. apply_synapse_headers
walks around that by calling with_headers once per single-entry
HashMap, so each call actually inserts. Re-evaluate when we move
past 0.27.
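The zip behaviour described above is easy to reproduce with the standard library. This is a simplified model of the described pattern, not the opentelemetry-otlp source; `with_headers_model` and `apply_headers_one_by_one` are hypothetical names.

```rust
use std::collections::HashMap;

/// Model of the problem: an Option iterator yields at most one item, so
/// zipping it with `headers` produces at most one (map, header) pair.
pub fn with_headers_model(
    slot: &mut Option<HashMap<String, String>>,
    headers: HashMap<String, String>,
) {
    slot.iter_mut().zip(headers).for_each(|(map, (k, v))| {
        map.insert(k, v);
    });
}

/// Workaround shape: one call per single-entry map, so every call inserts.
pub fn apply_headers_one_by_one(
    slot: &mut Option<HashMap<String, String>>,
    headers: HashMap<String, String>,
) {
    for (k, v) in headers {
        let mut single = HashMap::new();
        single.insert(k, v);
        with_headers_model(slot, single);
    }
}
```

Passing a two-entry map to `with_headers_model` leaves one entry in the slot, while `apply_headers_one_by_one` leaves both.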
Verified end-to-end on the local stack: the
synapse-telemetry::e2e_emit example POSTs to telemetry-api, the
access log now reports user_agent='Synapse/0.1.0' (was 'OTel OTLP
Exporter Rust/0.27.0'), Authorization is still validated, the
event reaches Kafka, and the workflow-service block counter writes
the bucket with every dimension.