feat: initial implementation of QUIC agent tunnel #1738

Merged
Benoît Cortier (CBenoit) merged 21 commits into master from feat/quic-tunnel-1-core
Apr 21, 2026
Contributor

@irvingoujAtDevolution irvingouj@Devolutions (irvingoujAtDevolution) commented Apr 2, 2026

Summary

QUIC-based agent tunnel (PR 1 of 4). Agents in private networks connect outbound to Gateway via QUIC/mTLS, advertise reachable subnets and domains, and proxy TCP connections. Pure Rust (Quinn + rustls), zero C dependencies.

See Technical Spec for protocol details.

PR stack

  1. Protocol + Tunnel Core (this PR)
  2. Transparent Routing
  3. Auth + Webapp
  4. Deployment + Installer

Highlights

  • Quinn QUIC transport with mTLS (private PKI)
  • CSR-based enrollment (private key never leaves agent)
  • Auto-reconnect with exponential backoff
  • AD domain auto-detection
  • Bounded deserialization, buffer limits, connection limits
  • 32 + 15 tests

🤖 Generated with Claude Code

@irvingoujAtDevolution
Contributor Author

QUIC Agent Tunnel — Technical Specification

1. Enrollment

How an agent gets its certificate

Admin                        Agent Machine                  Gateway
  │                              │                             │
  │  Click "Enroll Agent"        │                             │
  │  in DVLS / Gateway webapp    │                             │
  │                              │                             │
  │  Copy enrollment string      │                             │
  │  dgw-enroll:v1:base64...    │                             │
  │                              │                             │
  │  Paste into MSI installer    │                             │
  │  or CLI command              │                             │
  │                              │                             │
  │                              │  1. Decode enrollment string │
  │                              │     → gateway URL            │
  │                              │     → one-time token         │
  │                              │     → QUIC endpoint          │
  │                              │                             │
  │                              │  2. Generate ECDSA P-256     │
  │                              │     key pair LOCALLY         │
  │                              │     Write key to disk (0600) │
  │                              │                             │
  │                              │  3. Generate CSR             │
  │                              │     (public key + signature) │
  │                              │                             │
  │                              │── POST /enroll ────────────>│
  │                              │   { agent_name, csr_pem }   │
  │                              │   Bearer: <one-time-token>  │
  │                              │                             │
  │                              │                             │  4. Validate token
  │                              │                             │     (consumed, cannot replay)
  │                              │                             │  5. Verify CSR signature
  │                              │                             │  6. Assign agent UUID
  │                              │                             │  7. Sign cert with CA
  │                              │                             │     (embed UUID in SAN)
  │                              │                             │
  │                              │<────────────────────────────│
  │                              │   { agent_id, cert_pem,     │
  │                              │     ca_cert_pem, endpoint } │
  │                              │                             │
  │                              │  8. Write cert + CA cert     │
  │                              │  9. Update agent.json        │
  │                              │     (Tunnel section only,    │
  │                              │      preserves other config) │
  │                              │                             │
  │                              │  10. Connect via QUIC ──────>│
  │                              │      (mTLS with new cert)    │

Key property: the private key never leaves the agent machine.
Only the CSR (containing the public key and a proof-of-possession signature) is transmitted.
The enrollment response contains only the signed certificate and the CA certificate — no secrets.

Enrollment token

The enrollment token is either:

  • A one-time UUID (122-bit entropy) generated by the gateway — consumed atomically on use, cannot be replayed.
  • A static secret from gateway configuration — compared in constant time.
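The two token modes above can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: `TokenStore`, `issue`, `redeem`, and `fixed_time_eq` are hypothetical names, and the sketch assumes a single owner of the store (a shared store would need a lock around it).

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical in-memory store for one-time enrollment tokens.
struct TokenStore {
    /// Maps each outstanding token to its expiry time.
    tokens: HashMap<String, Instant>,
}

impl TokenStore {
    fn new() -> Self {
        TokenStore { tokens: HashMap::new() }
    }

    /// Registers a token that stays valid for `ttl` starting at `now`.
    fn issue(&mut self, token: &str, ttl: Duration, now: Instant) {
        self.tokens.insert(token.to_owned(), now + ttl);
    }

    /// Consumes the token. The entry is removed before the expiry check,
    /// so a second call with the same token always returns false.
    fn redeem(&mut self, token: &str, now: Instant) -> bool {
        match self.tokens.remove(token) {
            Some(expiry) => now < expiry,
            None => false,
        }
    }
}

/// Compares two byte strings without stopping at the first mismatch.
/// The length check still returns early, so the length is not hidden.
fn fixed_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y;
    }
    diff == 0
}
```

Removing the entry as part of the lookup is what makes replay impossible in this sketch: there is no window between "check" and "mark as used". For the static secret, a production comparison would normally come from a vetted crate rather than a hand-written loop, since the compiler is free to optimize code like `fixed_time_eq`.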

2. Stream Multiplexing

One QUIC connection, many independent streams

Agent ←──── single QUIC connection ────→ Gateway
             │
             ├── Stream 0 (control, always open)
             │   Agent → GW:  RouteAdvertise every 30s
             │   Agent → GW:  Heartbeat every 60s
             │   GW → Agent:  HeartbeatAck
             │
             ├── Stream 1 (RDP session #1)
             │   GW → Agent:  ConnectMessage { target: 10.0.0.5:3389 }
             │   Agent → GW:  ConnectResponse::Success
             │   Then: raw bidirectional bytes (RDP protocol data)
             │
             ├── Stream 5 (SSH session #1)
             │   GW → Agent:  ConnectMessage { target: 10.0.0.10:22 }
             │   Agent → GW:  ConnectResponse::Success
             │   Then: raw bidirectional bytes (SSH protocol data)
             │
             └── Stream 9 (SSH session #2)
                 GW → Agent:  ConnectMessage { target: 10.0.0.20:22 }
                 Agent → GW:  ConnectResponse::Success
                 Then: raw bidirectional bytes (SSH protocol data)

Each stream is independently ordered.
A retransmission on stream 1 does not block streams 5 or 9.
This is QUIC's core advantage over TCP — no head-of-line blocking across streams.
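The stream numbers in the diagram (0, then 1, 5, 9) follow from how QUIC encodes the initiator and direction in the two low bits of a stream ID (RFC 9000, section 2.1). The sketch below only illustrates that arithmetic; `StreamKind`, `stream_kind`, and `nth_server_bidi_stream_id` are hypothetical names, and in practice the QUIC library assigns the IDs itself.

```rust
/// Initiator and direction of a QUIC stream, decoded from the
/// two least significant bits of its stream ID.
#[derive(Debug, PartialEq, Eq)]
enum StreamKind {
    ClientBidi,
    ServerBidi,
    ClientUni,
    ServerUni,
}

fn stream_kind(id: u64) -> StreamKind {
    match id & 0b11 {
        0 => StreamKind::ClientBidi,
        1 => StreamKind::ServerBidi,
        2 => StreamKind::ClientUni,
        _ => StreamKind::ServerUni,
    }
}

/// ID of the n-th (zero-based) server-initiated bidirectional stream.
fn nth_server_bidi_stream_id(n: u64) -> u64 {
    n * 4 + 1
}
```

With the agent as QUIC client and the gateway as QUIC server, stream 0 is the first client-initiated bidirectional stream (the control stream), and `nth_server_bidi_stream_id` yields the gateway-opened session streams.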

How a new session is established

  1. Gateway allocates the next server-initiated stream ID (1, 5, 9, 13, …).
  2. Gateway writes a length-prefixed ConnectMessage to the new stream.
  3. Agent reads the stream, decodes the ConnectMessage.
  4. Agent validates the target IP is within its advertised subnets (security boundary).
  5. Agent opens a TCP connection to the target.
  6. Agent writes ConnectResponse::Success back on the same stream.
  7. From this point, every byte on the QUIC stream is forwarded 1:1 to/from the TCP connection.

No new QUIC handshake is needed — streams are opened instantly on the existing connection.
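Step 4 is the agent-side security boundary, so it is worth spelling out. A minimal sketch of the subnet check, assuming IPv4 only and using hypothetical names (`Ipv4Subnet`, `target_allowed`); the PR's actual types may differ:

```rust
use std::net::Ipv4Addr;

/// Hypothetical representation of one advertised IPv4 subnet.
#[derive(Debug, Clone, Copy)]
struct Ipv4Subnet {
    network: Ipv4Addr,
    prefix_len: u8,
}

impl Ipv4Subnet {
    /// Returns true when `ip` shares the first `prefix_len` bits with `network`.
    fn contains(&self, ip: Ipv4Addr) -> bool {
        if self.prefix_len > 32 {
            return false; // malformed prefix: match nothing
        }
        // A /0 prefix needs special handling because shifting a u32 by 32 overflows.
        let mask = if self.prefix_len == 0 {
            0
        } else {
            u32::MAX << (32 - u32::from(self.prefix_len))
        };
        (u32::from(ip) & mask) == (u32::from(self.network) & mask)
    }
}

/// The agent refuses any target outside the subnets it advertised.
fn target_allowed(advertised: &[Ipv4Subnet], target: Ipv4Addr) -> bool {
    advertised.iter().any(|subnet| subnet.contains(target))
}
```

Running this check on the agent, and not only on the gateway, means a compromised or misconfigured gateway cannot use the agent to reach hosts outside the advertised ranges.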

Message encoding

All control and session setup messages use length-prefixed bincode:

┌─────────────────────────┬──────────────────────────────┐
│ 4 bytes (big-endian u32)│ N bytes (bincode payload)    │
│ message_length = N      │                              │
└─────────────────────────┴──────────────────────────────┘

After ConnectResponse::Success, the stream carries raw bytes — no framing, no headers.
The gateway and agent act as transparent TCP proxies.

Size limits

| Message type | Max size | Purpose |
|---|---|---|
| Control messages | 1 MiB | RouteAdvertise, Heartbeat |
| Session messages | 64 KiB | ConnectMessage, ConnectResponse |

Limits are enforced on the length prefix (before reading the payload) and on the bincode deserializer (prevents crafted payloads with huge internal Vec lengths).
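The framing and the length-prefix limit can be sketched over any blocking `Read`/`Write` pair. This is an illustration under assumptions: `write_frame`, `read_frame`, and the two constants are hypothetical names, the payload is treated as opaque bytes, and the real code runs over async QUIC streams instead of `std::io`.

```rust
use std::io::{self, Read, Write};

const MAX_CONTROL_FRAME: u32 = 1024 * 1024; // 1 MiB
const MAX_SESSION_FRAME: u32 = 64 * 1024; // 64 KiB

/// Writes a 4-byte big-endian length followed by the payload.
fn write_frame<W: Write>(w: &mut W, payload: &[u8], max_len: u32) -> io::Result<()> {
    if payload.len() > max_len as usize {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "payload exceeds frame limit",
        ));
    }
    let len = payload.len() as u32; // safe: checked against a u32 limit above
    w.write_all(&len.to_be_bytes())?;
    w.write_all(payload)
}

/// Reads one frame, rejecting an oversized length prefix
/// before any payload buffer is allocated.
fn read_frame<R: Read>(r: &mut R, max_len: u32) -> io::Result<Vec<u8>> {
    let mut prefix = [0u8; 4];
    r.read_exact(&mut prefix)?;
    let len = u32::from_be_bytes(prefix);
    if len > max_len {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "frame exceeds limit",
        ));
    }
    let mut payload = vec![0u8; len as usize];
    r.read_exact(&mut payload)?;
    Ok(payload)
}
```

Checking the prefix first is what keeps a peer from forcing a multi-gigabyte allocation with four bytes. The second limit mentioned above (inside the deserializer) is a separate concern that this sketch does not cover.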

3. User Experience

Network topology

┌─────────────────────────────────────────────────────────┐
│  Cloud                                                   │
│  ┌──────────────────┐                                   │
│  │ Devolutions      │                                   │
│  │ Gateway          │  ← publicly reachable              │
│  │ gateway.acme.com │                                   │
│  └────────┬─────────┘                                   │
│           │ QUIC (UDP 4433)                              │
└───────────┼─────────────────────────────────────────────┘
            │
       ─ ─ ─│─ ─ ─ ─ ─ ─ firewall (outbound only) ─ ─ ─ ─
            │
┌───────────┼─────────────────────────────────────────────┐
│  Office   │                                              │
│  ┌────────┴─────────┐    ┌──────────┐  ┌──────────┐    │
│  │ Agent            │    │ DC       │  │ File     │    │
│  │ 10.10.0.8        │───→│ 10.10.0.3│  │ Server   │    │
│  │ advertises:      │    │ (RDP+KDC)│  │ 10.10.0.5│    │
│  │  10.10.0.0/24    │    └──────────┘  └──────────┘    │
│  │  contoso.local   │                                   │
│  └──────────────────┘                                   │
└─────────────────────────────────────────────────────────┘

Admin setup (one-time)

  1. Open Gateway webapp → Agents → Enroll Agent.
  2. Copy the enrollment string.
  3. On the agent machine: devolutions-agent up --enrollment-string "dgw-enroll:v1:...".
  4. Agent enrolls, connects, starts advertising 10.10.0.0/24 + contoso.local.

End-user workflow (daily use)

The user has no awareness of the agent. From their perspective:

  1. Open RDM or Gateway webapp.
  2. Create an RDP connection to 10.10.0.3.
  3. Click connect.
  4. The RDP desktop appears.

What happens behind the scenes:

User's browser
  → WebSocket to Gateway (gateway.acme.com)
    → Gateway routing: 10.10.0.3 matches agent's 10.10.0.0/24 subnet
      → Gateway opens QUIC stream 5 to agent
        → ConnectMessage { target: "10.10.0.3:3389" }
          → Agent connects TCP to 10.10.0.3:3389
            → ConnectResponse::Success
              → RDP data flows bidirectionally

No VPN. No inbound firewall rules on the office network. No routing configuration.

Transparent routing rules

When a connection request arrives, the gateway evaluates routing in priority order:

  1. Explicit agent ID — if the session token contains jet_agent_id, route to that specific agent.
  2. IP subnet match — if the target is an IP address, find agents whose advertised subnets contain it.
  3. Domain suffix match — if the target is a hostname, find agents whose advertised domains match by longest suffix (e.g., db01.finance.contoso.local matches finance.contoso.local over contoso.local).
  4. No match — direct connection (gateway connects to the target itself, no tunnel).

When multiple agents match the same target, the most recently seen agent is tried first.
If it fails, the next candidate is tried (automatic failover).
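The priority order and failover list can be sketched as a single function that returns candidate agents in the order they should be tried, with an empty list meaning "connect directly". Everything here is hypothetical (`Agent`, `route_candidates`, integer timestamps, IPv4 only). One assumption goes beyond the text above: agents with a shorter matching domain suffix are kept as later fallbacks instead of being dropped.

```rust
use std::net::Ipv4Addr;

/// Hypothetical registry entry for a connected agent.
struct Agent {
    id: String,
    subnets: Vec<(Ipv4Addr, u8)>,
    domains: Vec<String>,
    /// Time of the last heartbeat; larger means more recent.
    last_seen: u64,
}

fn in_subnet(ip: Ipv4Addr, network: Ipv4Addr, prefix_len: u8) -> bool {
    let mask = match prefix_len {
        0 => 0,
        1..=32 => u32::MAX << (32 - u32::from(prefix_len)),
        _ => return false,
    };
    (u32::from(ip) & mask) == (u32::from(network) & mask)
}

/// Length of `domain` if `host` equals it or is a subdomain of it.
fn suffix_len(host: &str, domain: &str) -> Option<usize> {
    let host = host.to_ascii_lowercase();
    let domain = domain.to_ascii_lowercase();
    if host == domain || host.ends_with(&format!(".{}", domain)) {
        Some(domain.len())
    } else {
        None
    }
}

/// Returns agent IDs in the order they should be tried.
/// An empty result means no tunnel: the gateway connects directly.
fn route_candidates<'a>(
    agents: &'a [Agent],
    explicit_id: Option<&'a str>,
    target_host: &str,
) -> Vec<&'a str> {
    // Rule 1: an explicit agent ID from the token wins outright.
    if let Some(id) = explicit_id {
        return vec![id];
    }
    let mut matches: Vec<(usize, &'a Agent)> = match target_host.parse::<Ipv4Addr>() {
        // Rule 2: IP target, match on advertised subnets.
        Ok(ip) => agents
            .iter()
            .filter(|a| a.subnets.iter().any(|&(net, len)| in_subnet(ip, net, len)))
            .map(|a| (0, a))
            .collect(),
        // Rule 3: hostname target, keep each agent's longest matching suffix.
        Err(_) => agents
            .iter()
            .filter_map(|a| {
                a.domains
                    .iter()
                    .filter_map(|d| suffix_len(target_host, d))
                    .max()
                    .map(|len| (len, a))
            })
            .collect(),
    };
    // Longest suffix first, then most recently seen first.
    matches.sort_by(|x, y| y.0.cmp(&x.0).then(y.1.last_seen.cmp(&x.1.last_seen)));
    matches.into_iter().map(|(_, a)| a.id.as_str()).collect()
}
```

A caller would walk the returned list and fall through to a direct connection only when it is empty, which mirrors rule 4 and the failover behaviour described above.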

Resilience

  • Agent auto-reconnects if the QUIC connection drops (exponential backoff, 1s–60s, with jitter).
  • Config re-read on every reconnection attempt (admin can change subnets without restarting the service).
  • Heartbeat monitoring — agents are marked offline after 90 seconds without a heartbeat.
  • Graceful shutdown — agent sends QUIC close frame, gateway immediately unregisters it from routing.
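The reconnect delay can be sketched as a pure function. The text only states the 1s to 60s range and that jitter is applied; the "half fixed, half random" scheme and the names `backoff_delay`, `BASE_DELAY`, and `MAX_DELAY` below are assumptions for illustration. The random fraction is passed in by the caller so the function stays deterministic.

```rust
use std::time::Duration;

const BASE_DELAY: Duration = Duration::from_secs(1);
const MAX_DELAY: Duration = Duration::from_secs(60);

/// Delay before reconnect attempt number `attempt` (zero-based).
/// `jitter` is a random fraction in [0.0, 1.0] supplied by the caller.
fn backoff_delay(attempt: u32, jitter: f64) -> Duration {
    // Cap the exponent so the shift below cannot overflow.
    let exp = attempt.min(16);
    let capped = BASE_DELAY.saturating_mul(1u32 << exp).min(MAX_DELAY);
    // Keep half of the delay fixed and randomize the other half.
    let half = capped / 2;
    let jittered = half + half.mul_f64(jitter.clamp(0.0, 1.0));
    // Never go below the documented 1 s floor.
    jittered.max(BASE_DELAY)
}
```

Jitter matters most after a gateway restart: without it, every agent that lost its connection at the same moment would retry at the same moment as well.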

Contributor

Copilot AI left a comment

Pull request overview

Adds the first slice of a QUIC/mTLS “agent tunnel” system: a shared binary protocol crate, a Gateway-side QUIC listener/registry/enrollment API, and an Agent-side enrollment + reconnecting tunnel client. This enables routing Gateway-initiated TCP proxy sessions through outbound-connected agents (for private-network reachability).

Changes:

  • Introduces agent-tunnel-proto crate (control/session messages, framing, protocol versioning).
  • Adds Gateway agent-tunnel core (agent_tunnel module), config wiring, REST endpoints, and token claim support (jet_agent_id) used in the forwarding path.
  • Adds Agent enrollment/bootstrap + QUIC tunnel client with auto-reconnect and domain auto-detection.

Reviewed changes

Copilot reviewed 35 out of 36 changed files in this pull request and generated 13 comments.

Show a summary per file
| File | Description |
|---|---|
| devolutions-gateway/tests/config.rs | Updates config samples to include agent_tunnel field. |
| devolutions-gateway/src/token.rs | Adds jet_agent_id to association claims; adjusts scope token claims serialization/visibility. |
| devolutions-gateway/src/service.rs | Initializes and registers the agent-tunnel listener task when enabled. |
| devolutions-gateway/src/ngrok.rs | Threads agent_tunnel_handle into the TCP tunnel client path. |
| devolutions-gateway/src/middleware/auth.rs | Adds auth exception for /jet/agent-tunnel/enroll (self-auth via bearer token). |
| devolutions-gateway/src/listener.rs | Threads agent_tunnel_handle into the generic client path. |
| devolutions-gateway/src/lib.rs | Exposes agent_tunnel module and adds agent_tunnel_handle to DgwState. |
| devolutions-gateway/src/generic_client.rs | Uses jet_agent_id to route Fwd connections through the agent tunnel. |
| devolutions-gateway/src/extract.rs | Adds request extractors for agent-management read/write access control. |
| devolutions-gateway/src/config.rs | Adds AgentTunnelConf to Gateway config DTO and runtime config. |
| devolutions-gateway/src/api/webapp.rs | Ensures new jet_agent_id claim is present (set to None) when minting tokens. |
| devolutions-gateway/src/api/mod.rs | Nests the new /jet/agent-tunnel/* router. |
| devolutions-gateway/src/api/agent_enrollment.rs | Implements enrollment + agent management endpoints (list/get/delete/resolve-target). |
| devolutions-gateway/src/agent_tunnel/mod.rs | Declares agent-tunnel submodules and re-exports core types. |
| devolutions-gateway/src/agent_tunnel/listener.rs | QUIC UDP listener event loop + proxy-stream request dispatching. |
| devolutions-gateway/src/agent_tunnel/enrollment_store.rs | In-memory single-use enrollment token store with expiry. |
| devolutions-gateway/src/agent_tunnel/stream.rs | Tokio AsyncRead/AsyncWrite wrapper over QUIC streams via channels. |
| devolutions-gateway/src/agent_tunnel/registry.rs | Agent registry with heartbeat liveness + subnet/domain routing selection. |
| devolutions-gateway/src/agent_tunnel/connection.rs | Managed quiche connection: handshake identity, control parsing, proxy stream setup. |
| devolutions-gateway/src/agent_tunnel/cert.rs | CA manager for enrollment signing + server cert issuance and cert parsing helpers. |
| devolutions-gateway/Cargo.toml | Adds QUIC/proto/cert/routing dependencies for the tunnel feature. |
| devolutions-agent/src/service.rs | Registers TunnelTask when tunnel is enabled; fixes conf_handle cloning for RDP task. |
| devolutions-agent/src/main.rs | Adds CLI support for enroll/up bootstrap flows and parsing helpers + tests. |
| devolutions-agent/src/lib.rs | Exposes new modules: tunnel, enrollment, domain_detect. |
| devolutions-agent/src/enrollment.rs | Implements enrollment request + persistence of certs/config merge. |
| devolutions-agent/src/domain_detect.rs | Adds Windows/Linux DNS domain auto-detection helper. |
| devolutions-agent/src/tunnel.rs | Implements reconnecting QUIC client + control/session stream handling and TCP proxying. |
| devolutions-agent/src/config.rs | Adds tunnel config section; makes save_config/get_conf_file_path public. |
| devolutions-agent/Cargo.toml | Adds proto/quiche/reqwest/rcgen dependencies and Windows feature for domain detection. |
| crates/agent-tunnel-proto/src/lib.rs | Defines the protocol crate API surface and exports. |
| crates/agent-tunnel-proto/src/version.rs | Adds protocol version constants + validation helper. |
| crates/agent-tunnel-proto/src/error.rs | Defines protocol-level error types. |
| crates/agent-tunnel-proto/src/control.rs | Adds control-plane message definitions + framed encode/decode. |
| crates/agent-tunnel-proto/src/session.rs | Adds session-plane message definitions + framed encode/decode. |
| crates/agent-tunnel-proto/Cargo.toml | New crate manifest and dependencies. |
| Cargo.lock | Locks new dependencies introduced for QUIC, cert handling, registry, and protocol crate. |

Comment thread devolutions-gateway/src/api/agent_enrollment.rs Outdated
Comment thread devolutions-gateway/src/api/agent_enrollment.rs Outdated
Comment thread devolutions-gateway/src/extract.rs
Comment thread devolutions-gateway/src/agent_tunnel/listener.rs Outdated
Comment thread devolutions-gateway/src/agent_tunnel/connection.rs Outdated
Comment thread crates/agent-tunnel-proto/src/session.rs Outdated
Comment thread crates/agent-tunnel-proto/src/control.rs Outdated
Comment thread devolutions-agent/src/tunnel.rs Outdated
Comment thread devolutions-agent/src/tunnel.rs Outdated
Comment thread devolutions-gateway/src/generic_client.rs
Comment thread devolutions-gateway/src/api/mod.rs Outdated
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 38 out of 39 changed files in this pull request and generated 4 comments.


Comment thread crates/agent-tunnel/src/cert.rs
Comment thread crates/agent-tunnel/src/listener.rs
Comment thread devolutions-agent/src/tunnel.rs
Comment thread crates/agent-tunnel-proto/src/stream.rs Outdated
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Hoist protocol version validation before match in both gateway and
  agent control loops (single check, no per-variant boilerplate)
- Validate ConnectResponse protocol version in connect_via_agent
- ServerCertStatus enum for ensure_server_cert (expiry + hostname SAN)
- send.finish() after proxy copy (graceful QUIC EOF)
- Fix constant_time_eq doc (inaccurate timing claim)
- Extract ALPN to agent_tunnel_proto::ALPN_PROTOCOL constant
- Destruct EnrollResponse at parameter level for readability
- ValidatedTunnelConf: make wrong state unrepresentable at type level
  (dto::TunnelConf for JSON, TunnelConf for runtime with non-optional fields)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread devolutions-agent/src/main.rs Outdated
Comment thread crates/agent-tunnel-proto/Cargo.toml
Comment thread devolutions-gateway/Cargo.toml Outdated
Member

suggestion: I see a lot of new dependencies. Maybe reevaluate which dependencies are absolutely necessary and which could be removed. I see we pull in multiple libraries to parse PEM files… Pretty sure we already had something before pem and rustls-pem.

Comment thread devolutions-gateway/Cargo.toml
- `dashmap::DashMap` → `tokio::sync::RwLock<HashMap>` in
  `enrollment_store`, `listener`, `registry`. All lookups/inserts await
  the lock; values are cloned out (Arc/quinn::Connection) so no guards
  escape the critical section.
- `pem` crate → `rustls_pemfile::certs` via a small `cert_pem_to_der`
  helper in `cert.rs`. The CSR tamper test now uses `base64` directly
  for PEM encode/decode.
- `bincode` + `serde` → hand-rolled binary encoding in
  `agent-tunnel-proto`, following the `jmux-proto` pattern:
  * `FramedSend<S>` / `FramedRecv<R>` handle length-prefixed framing
    and encode/decode via private `Encode` / `Decode` traits.
  * `ControlStream` / `SessionStream` compose `FramedSend` +
    `FramedRecv` with their respective max frame sizes; no more free
    `write_framed` / `read_framed` helpers.
  * Each `ControlMessage` / `ConnectRequest` / `ConnectResponse`
    variant has an explicit wire layout with tag bytes, big-endian
    integers, u32-length-prefixed strings, and explicit IPv4 framing.
  * `serde` becomes an optional feature on the proto crate, enabled by
    `devolutions-gateway` for its JSON API (`DomainAdvertisement`
    serialization); `devolutions-agent` drops it entirely.

All 18 proto tests (roundtrip + proptest) pass unchanged.

Addresses Benoit's review comment: "Switch to JWT instead of a custom
format". The old `dgw-enroll:v1:<base64-JSON>` envelope is replaced with
a standard JWT that carries the same information via JWT claims and
doubles as the Bearer token for `/jet/tunnel/enroll`.

Gateway:
- Add `AccessScope::TunnelEnroll` and a dedicated `EnrollmentTokenClaims`
  struct with `scope`, `exp`, `jti`, `jet_gw_url` (required) and
  `jet_agent_name` (optional). The `jet_*` prefix matches the existing
  convention for gateway-specific custom claims (`jet_aid`, `jet_ap`,
  `jet_gw_id`, ...).
- Add `validate_enrollment_jwt` in `api/tunnel.rs` (ported from feature
  branch). Verifies signature against `provisioner_public_key`, checks
  `exp`/`nbf` via picky's strict validator, and enforces scope is
  `TunnelEnroll` or `Wildcard`.
- `enroll_agent` now tries JWT first, then the one-time token store,
  then the static `enrollment_secret` as a fallback.
- 7 unit tests cover the happy path, wildcard scope, wrong scope,
  expiry, signature mismatch, missing required claim, and malformed
  input.

Agent:
- Replace `EnrollmentStringPayload` / `parse_enrollment_string` with
  `EnrollmentJwtClaims` / `parse_enrollment_jwt`. The parser splits on
  `.` and decodes the payload segment without verifying the signature
  (agent is the intended recipient; the Gateway verifies on enrollment).
- The JWT string itself becomes the Bearer token — no more separate
  `enrollment_token` field nested inside a custom envelope.
- 3 tests: happy path via `parse_up_command_args`, malformed rejection,
  and missing-`jet_gw_url` rejection.

Also fixes the pre-existing inline registry tests that broke in the
previous commit when `DashMap` → `tokio::sync::RwLock<HashMap>` made
`AgentRegistry` methods async and `DomainAdvertisement.domain` became
a `DomainName` newtype.

Addresses Benoit's review comment: "Extract more logic into a separate
crate, the same way we did for the network scanner. `agent-tunnel-proto`
(already existing) + `agent-tunnel`."

The agent tunnel module was already self-contained (zero `use crate::*`
imports), so the extraction is a mechanical move:

- Create `crates/agent-tunnel/` as a new workspace crate
- Move `cert.rs`, `enrollment_store.rs`, `listener.rs`, `registry.rs`,
  `stream.rs` from `devolutions-gateway/src/agent_tunnel/` (git tracks
  these as renames)
- New `lib.rs` does the `#[macro_use] extern crate tracing` dance and
  re-exports the public surface (`AgentTunnelHandle`,
  `AgentTunnelListener`, `AgentRegistry`, `EnrollmentTokenStore`,
  `TunnelStream`)
- Delete `devolutions-gateway/src/agent_tunnel/mod.rs`
- Gateway now depends on `agent-tunnel` as a path dependency; call
  sites change `crate::agent_tunnel::*` → `agent_tunnel::*`

Also promote `Encode` / `Decode` in `agent-tunnel-proto::codec` from
`pub(crate)` to `pub` so `FramedSend::send` / `FramedRecv::recv` (which
bound on them) are reachable in the new crate without `private_bounds`
warnings.

Tests: 20 moved from gateway inline into the new crate and all still
pass; gateway still has 64 lib tests + all integration tests green;
agent + proto tests untouched.

Review-agent findings addressed:

- Drop `ControlMessage`/`ConnectRequest`/`ConnectResponse` inherent
  `encode`/`decode` methods. They duplicated the `Encode`/`Decode`
  trait impls with identical signatures, so callsites and rustdoc saw
  two methods for one job. Only the trait impls remain; stream wrappers
  already go through the traits.
- `RouteAdvertisementState::update_routes` same-epoch branch now logs
  the *incoming* subnet/domain counts (previously re-logged the
  existing state's count, which read as if we had accepted the new
  set) and makes it explicit in the message that incoming routes are
  ignored.
- Rename `constant_time_eq` → `timing_safe_eq`. The function hashes
  inputs first and only the 32-byte digest compare is constant-time.
  New name describes intent; doc comment now explains both what the
  hash normalization buys and what the function does *not* guarantee.
- Document that `EnrollmentTokenStore::redeem` removes expired tokens
  as a side effect (so callers cannot distinguish "missing" from
  "expired", and shouldn't).
- Explain in `parse_enrollment_jwt` why we handroll the split/decode
  instead of pulling `picky` into the agent for unverified payload
  reading.
- Move `use agent_tunnel_proto::current_time_millis;` to the top of
  `registry.rs` with the other imports (was dangling at module bottom
  after the IPv4-only revert).
- Apply `cargo fmt`.
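
One way to read the `redeem` note above is sketched below. `TokenStore` is a hypothetical stand-in for `EnrollmentTokenStore`; its field layout, the explicit `now` parameter, and the choice to remove only the looked-up entry are assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for `EnrollmentTokenStore`; the real fields may differ.
#[derive(Default)]
struct TokenStore {
    expires_at: HashMap<String, Instant>,
}

impl TokenStore {
    fn insert(&mut self, token: &str, ttl: Duration, now: Instant) {
        self.expires_at.insert(token.to_owned(), now + ttl);
    }

    /// Redeem a one-time token. The entry is removed whether or not it has
    /// expired, so a later call cannot tell "expired" from "never issued".
    fn redeem(&mut self, token: &str, now: Instant) -> bool {
        match self.expires_at.remove(token) {
            Some(deadline) => now < deadline,
            None => false,
        }
    }
}
```

Passing `now` in explicitly keeps the sketch deterministic under test; a production store would more likely read the clock itself.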

Tests: 20 agent-tunnel + 13 agent-tunnel-proto + 5 session_roundtrip
+ 64 gateway lib + 5 devolutions-agent, all green. Zero clippy
warnings on the changed crates.

- Drop the 1-byte IP family tag from each subnet on the wire. The type
  is `Ipv4Network`, so the tag could only ever be `0x04`. Encoding it
  amounted to a TODO baked into the wire format: it would have
  constrained a future v2 without helping v1. Each subnet is now
  `[4B ipv4_octets][1B prefix]`, which saves a byte per subnet per
  `RouteAdvertise`. If IPv6 arrives, the wire change comes with a
  `protocol_version` bump and the format can reintroduce a tag cleanly.
- Add six unit tests for `DomainName::matches_hostname` covering exact
  match, case insensitivity, suffix match, rejected partial-label
  ("fakecontoso.local" vs "contoso.local"), unrelated hostname, and
  parent vs child domain. The method is only called from PR2's routing
  code; these tests make sure the algorithm is locked down on PR1 so
  the PR2 consumer can rely on it.
- `devolutions-agent/src/tunnel.rs`: replace the `continue;` on
  backoff exhaustion with a fall-through using a 1s floor. Previously,
  if `backoff.next_backoff()` ever returned `None` (supposedly
  unreachable with `max_elapsed_time(None)`), the loop would spin
  without any sleep. Defensive fix, not a correctness one.
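
For reference, the tagless `[4B ipv4_octets][1B prefix]` subnet layout from the first bullet can be sketched as below. The function names and the string error type are hypothetical; the crate's private `encode_subnets` / `decode_subnets` helpers work on `Ipv4Network` and return `ProtoError`.

```rust
use std::net::Ipv4Addr;

/// Encode one subnet as `[4B ipv4_octets][1B prefix]` (no family tag).
fn encode_subnet(addr: Ipv4Addr, prefix: u8, out: &mut Vec<u8>) {
    out.extend_from_slice(&addr.octets());
    out.push(prefix);
}

/// Decode one subnet from the front of `buf`, returning it and the rest.
fn decode_subnet(buf: &[u8]) -> Result<((Ipv4Addr, u8), &[u8]), &'static str> {
    if buf.len() < 5 {
        return Err("truncated subnet");
    }
    let (head, rest) = buf.split_at(5);
    let prefix = head[4];
    if prefix > 32 {
        return Err("IPv4 prefix out of range");
    }
    let addr = Ipv4Addr::new(head[0], head[1], head[2], head[3]);
    Ok(((addr, prefix), rest))
}
```

With this layout `10.0.0.0/8` goes on the wire as the five bytes `0a 00 00 00 08`.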

All 20 agent-tunnel / 19 agent-tunnel-proto / 64 gateway-lib / 5
session_roundtrip / 5 devolutions-agent tests still pass. Zero clippy
warnings on the changed crates.
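
The label-boundary rule that the new `DomainName::matches_hostname` tests pin down can be illustrated with a free function. This is a sketch of the behavior as described in the test list, not the method's actual body, and the parent-versus-child expectation in particular is an assumption.

```rust
/// Hypothetical free-function version of `DomainName::matches_hostname`.
fn matches_hostname(domain: &str, hostname: &str) -> bool {
    let domain = domain.to_ascii_lowercase();
    let hostname = hostname.to_ascii_lowercase();
    match hostname.strip_suffix(domain.as_str()) {
        // Exact match: nothing is left in front of the domain.
        Some("") => true,
        // Suffix match: the remainder must end at a label boundary,
        // which is what rejects "fakecontoso.local" for "contoso.local".
        Some(prefix) => prefix.ends_with('.'),
        None => false,
    }
}
```
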
Separate what types are from how they go on the wire, and align the
crate with the gateway test layout (all tests in `tests/`, zero inline
tests in `src/`, matching `jmux-proto` and `devolutions-gateway`).

- `src/control_codec.rs` (new): `TAG_*` constants, `impl Encode /
  Decode for ControlMessage`, plus the private `encode_subnets /
  decode_subnets / encode_domains / decode_domains /
  encode_cert_renewal_result / decode_cert_renewal_result` helpers.
- `src/control.rs`: now just type definitions, constructors, and the
  `protocol_version()` accessor. Reads as a pure data contract.
- `src/session_codec.rs` (new): `TAG_*` constants + `impl Encode /
  Decode for ConnectRequest / ConnectResponse`.
- `src/session.rs`: types + constructors + `is_success` /
  `protocol_version` only.
- `src/lib.rs`: registers the two new `pub(crate)` codec modules and
  re-exports `CertRenewalResult` at the crate root (needed by the new
  external test file).
- `tests/control.rs` (new): the six `DomainName::matches_hostname`
  tests, nine `ControlMessage` roundtrip tests, the oversized-message
  rejection test, and the proptest — all moved out of `src/control.rs`
  so the production file no longer has a trailing `#[cfg(test)]`
  block. Uses only the public API.
- `tests/session_roundtrip.rs` → `tests/session.rs` (git rename) for
  naming symmetry with `tests/control.rs`.

Tests still green: 3 (version) + 16 (control integration) +
5 (session integration) on agent-tunnel-proto; 20 on agent-tunnel and
84 on devolutions-gateway lib untouched. Zero clippy warnings; fmt
clean.

Roundtrip + proptest already catch correctness bugs, but three classes
of issue were only weakly covered: accidental wire-format drift, decode
error reporting, and send-side size enforcement. Fills those gaps.

control tests (`tests/control.rs`):

- Wire format lock-in: exact-bytes assertions for `Heartbeat` and
  `HeartbeatAck`. If anyone changes a tag value, field order, or
  integer endianness, these fail loudly instead of being absorbed by
  the roundtrip tests.
- Negative decode paths (five tests, each constructs a malformed
  `[4B length][payload]` frame and asserts the specific `ProtoError`
  variant):
  * unknown top-level message tag → `UnknownTag`
  * truncated `Heartbeat` payload → `Truncated`
  * non-UTF-8 string in `CertRenewalRequest::csr_pem` →
    `InvalidField { field: "string", .. }`
  * IPv4 prefix > 32 in a `RouteAdvertise` subnet →
    `InvalidField { field: "subnet", .. }`
  * unknown `CertRenewalResult` sub-tag → `UnknownTag`
- Send-side size enforcement: building a `CertRenewalRequest` whose
  encoded form exceeds `MAX_CONTROL_MESSAGE_SIZE` and asserting
  `send` returns `MessageTooLarge`.

session tests (`tests/session.rs`):

- Wire format lock-in for `ConnectRequest` (with a deterministic UUID
  and `"host:80"` target) and `ConnectResponse::Success`.
- Unknown-tag rejection for `ConnectResponse`.
- Truncated `ConnectRequest` decode returns `Truncated`.
- Oversized `ConnectRequest` send returns `MessageTooLarge`
  (exceeds `MAX_SESSION_MESSAGE_SIZE` = 64 KiB).

Totals: control integration 16 → 24; session integration 5 → 10.
All 34 integration + 3 version tests pass. Zero clippy warnings;
fmt clean.
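
To make the `[4B length][payload]` framing and the send-side size check concrete, here is a small sketch. The big-endian length, the 64 KiB cap, and the `FrameError` type are assumptions for illustration; the crate's real constants and `ProtoError` variants may carry different values and fields.

```rust
/// Assumed cap for this sketch; the real limits live in the proto crate.
const MAX_MESSAGE_SIZE: usize = 64 * 1024;

#[derive(Debug, PartialEq)]
enum FrameError {
    MessageTooLarge,
    Truncated,
}

/// Wrap `payload` as `[4B length][payload]`, enforcing the cap when sending.
fn encode_frame(payload: &[u8]) -> Result<Vec<u8>, FrameError> {
    if payload.len() > MAX_MESSAGE_SIZE {
        return Err(FrameError::MessageTooLarge);
    }
    let mut frame = Vec::with_capacity(4 + payload.len());
    // Big-endian length is an assumption of this sketch.
    frame.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    frame.extend_from_slice(payload);
    Ok(frame)
}

/// Return the payload of a single frame, rejecting short or oversized input.
fn decode_frame(frame: &[u8]) -> Result<&[u8], FrameError> {
    if frame.len() < 4 {
        return Err(FrameError::Truncated);
    }
    let len = u32::from_be_bytes([frame[0], frame[1], frame[2], frame[3]]) as usize;
    if len > MAX_MESSAGE_SIZE {
        return Err(FrameError::MessageTooLarge);
    }
    frame.get(4..4 + len).ok_or(FrameError::Truncated)
}
```

An exact-bytes assertion on `encode_frame` output is the same kind of lock-in the new tests apply to `Heartbeat`: a changed length width or byte order fails immediately, where a roundtrip test would still pass.
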
Symmetry with `ControlMessage` and `ConnectResponse`: both are enums
with tag bytes, so adding a new variant is additive on the wire — a v1
decoder sees an unknown tag and returns `UnknownTag` cleanly.

`ConnectRequest` was the odd one out: a fixed-layout struct that could
only ever mean "connect TCP to target". Adding a future session
opener — SOCKS5, UDP forward, authenticated connect, multi-target
try-each — would have required a version bump and branch-on-version
decoders, instead of the cleaner tag-based dispatch the other enums
get.

Changes:

- `session.rs`: `ConnectRequest` becomes an enum with one `Tcp`
  variant. Kept accessor methods (`protocol_version`, `session_id`,
  `target`) so call sites stay readable — matches the
  `ControlMessage::protocol_version()` pattern.
- `session_codec.rs`: introduces `TAG_REQUEST_TCP = 0x01` and
  dispatches on the leading byte. Unknown tags return `UnknownTag`.
- `crates/agent-tunnel/src/listener.rs` (gateway): `ConnectRequest::
  new(...)` → `ConnectRequest::tcp(...)`.
- `devolutions-agent/src/tunnel.rs`: fields → accessor methods
  (`connect_msg.session_id()`, `.target()`, `.protocol_version()`).
- `tests/session.rs`: updated constructors; wire format lock-in
  accounts for the new tag byte (outer length bumps from 29 → 30 for
  the `host:80` sample). Added `decode_rejects_unknown_connect_
  request_tag` negative test.

Wire format delta: `ConnectRequest` gains a 1-byte tag prefix. The old
layout is unreachable from any shipped build (PR1 is the introducing
PR), so there is no compat break in practice.
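
The tag-based dispatch can be pictured with a stripped-down decoder. Only `TAG_REQUEST_TCP = 0x01` and the `UnknownTag` / `Truncated` names come from the commit; the single-field `Tcp` variant and the error payloads are simplifications assumed here (the real request also carries a protocol version and a session id).

```rust
const TAG_REQUEST_TCP: u8 = 0x01;

/// Simplified: the real variant also holds a protocol version and session id.
#[derive(Debug, PartialEq)]
enum ConnectRequest {
    Tcp { target: String },
}

/// Illustrative error type; the real `ProtoError` variants may differ in shape.
#[derive(Debug, PartialEq)]
enum ProtoError {
    Truncated,
    UnknownTag(u8),
    InvalidUtf8,
}

/// Dispatch on the leading tag byte, then decode the variant body.
fn decode_connect_request(payload: &[u8]) -> Result<ConnectRequest, ProtoError> {
    let (&tag, body) = payload.split_first().ok_or(ProtoError::Truncated)?;
    match tag {
        TAG_REQUEST_TCP => {
            let target = std::str::from_utf8(body).map_err(|_| ProtoError::InvalidUtf8)?;
            Ok(ConnectRequest::Tcp { target: target.to_owned() })
        }
        // A future variant such as UDP forward would claim a new tag here;
        // older decoders reject it cleanly instead of misparsing the body.
        other => Err(ProtoError::UnknownTag(other)),
    }
}
```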

Tests: 3 + 24 + 11 on agent-tunnel-proto; 20 agent-tunnel; 64 gateway
lib; 5 agent, all green. Zero clippy warnings, fmt clean.

`main.rs` is the service entry point; it shouldn't also be a home for
wire-format-adjacent types. `enrollment.rs` already owns the rest of
the enrollment flow (HTTP call, cert persistence), so `EnrollmentJwt
Claims` and `parse_enrollment_jwt` belong there.

- Move `EnrollmentJwtClaims` struct + `parse_enrollment_jwt` fn from
  `devolutions-agent/src/main.rs` to
  `devolutions-agent/src/enrollment.rs`. Both become `pub` (consumed
  by `main.rs` through the lib→bin boundary).
- Move the two parser-specific unit tests (`parse_enrollment_jwt_
  rejects_malformed`, `parse_enrollment_jwt_requires_gw_url`) into a
  new `#[cfg(test)]` block in `enrollment.rs`, alongside their own
  `make_jwt` helper. The `parse_up_command_args_accepts_enrollment_
  string` test stays in `main.rs` since it's exercising CLI parsing,
  not the JWT decoder itself.
- `main.rs` now does `use devolutions_agent::enrollment::parse_
  enrollment_jwt;` and drops its top-level `base64::Engine` import
  (only the test module still needs it, so the trait use moves into
  the test `mod tests`).
- Drive-by: collapse `anyhow::Result` / `anyhow::bail!` in
  enrollment.rs to the freshly-imported `Result` / `bail` so they are
  consistent across the module.

Tests: 27 lib + 3 bin on devolutions-agent, all green. Zero clippy
warnings on the agent crate. Fmt clean.

Addresses Benoit's review comment: "I see pull multiple libraries to
parse PEM files… Pretty sure we already had something before `pem` and
`rustls-pem`." Commit `b15e4a19` already dropped the `pem` crate; this
commit completes the consolidation by moving the remaining PEM and
X.509 parsing from `rustls-pemfile` / `x509-parser` onto `picky` — the
Devolutions stack already used elsewhere in the gateway for JWT +
cert building + PEM reading.

agent-tunnel `Cargo.toml`:

- Drop direct deps: `rustls-pemfile`, `x509-parser`.
- Add: `picky = { features = ["x509", "time_conversion"] }` and
  `picky-asn1-x509` (needed to name the `GeneralName::Uri` /
  `GeneralName::DnsName` variants when matching on
  `ExtensionView::SubjectAltName`). Both are already transitive deps
  of `picky` in the workspace.
- `rcgen` keeps its `x509-parser` feature because rcgen's
  `CertificateSigningRequestParams::from_pem` relies on it — but that
  becomes a purely transitive dep now, not something our code imports.

`crates/agent-tunnel/src/cert.rs`:

- `cert_pem_to_der`: `rustls_pemfile::certs` → `picky::pem::parse_pem`
  with an explicit `CERTIFICATE` label check.
- New helpers `read_cert_chain` and `read_private_key` replace
  `rustls_pemfile::{certs, private_key}` for `build_server_tls_config`.
  Both return the tagged `rustls::pki_types` types
  (`CertificateDer`, `PrivateKeyDer` variants) so the rustls builder
  remains unchanged. `read_private_key` accepts the same three PEM
  labels `rustls-pemfile` did (PKCS#8, PKCS#1, SEC1).
- `spki_sha256_from_der`: `x509_parser::parse_x509_certificate` →
  `picky::x509::Cert::from_der`, then `public_key().to_der()` to get
  the SubjectPublicKeyInfo bytes for SHA-256. One subtle difference:
  x509-parser returned the raw SPKI bytes exactly as they appeared in
  the cert, whereas picky re-encodes them into canonical form. For a
  hash comparison the two match because the input is already
  canonical DER, so the resulting digest is unchanged.
- `extract_agent_name_from_der`: picky's `DirectoryName::
  find_common_name`.
- `extract_agent_id_from_der`: iterate `cert.extensions()` looking for
  `ExtensionView::SubjectAltName`, then match on `GeneralName::Uri`
  and strip the `urn:uuid:` prefix.
- `check_server_cert`: picky's `Cert::from_der` + `valid_not_after()`
  (converted to `OffsetDateTime` via the `time_conversion` feature) +
  same SAN-iteration pattern for DNS name matching.

No behavior change. 20 agent-tunnel lib tests + 64 gateway lib tests
green. Zero clippy warnings; fmt clean.
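
Two of the small decisions above lend themselves to a quick illustration: mapping a PEM label to a private-key encoding, and stripping the `urn:uuid:` prefix from a SAN URI. The sketch below uses hypothetical names (`KeyEncoding`, `key_encoding_for_label`, `agent_id_from_san_uri`) and plain strings; the real helpers operate on picky's parsed PEM and `GeneralName` types and return `rustls::pki_types` values.

```rust
/// Which `PrivateKeyDer` flavor a PEM block would map to (illustrative enum).
#[derive(Debug, PartialEq)]
enum KeyEncoding {
    Pkcs8,
    Pkcs1,
    Sec1,
}

/// Map a PEM label to a key encoding, accepting the three standard labels.
fn key_encoding_for_label(label: &str) -> Option<KeyEncoding> {
    match label {
        "PRIVATE KEY" => Some(KeyEncoding::Pkcs8),
        "RSA PRIVATE KEY" => Some(KeyEncoding::Pkcs1),
        "EC PRIVATE KEY" => Some(KeyEncoding::Sec1),
        _ => None, // e.g. "CERTIFICATE" or "ENCRYPTED PRIVATE KEY"
    }
}

/// Extract the agent id from a SAN URI of the form `urn:uuid:<id>`.
/// The remainder is returned as-is; UUID validation is left to the caller.
fn agent_id_from_san_uri(uri: &str) -> Option<&str> {
    uri.strip_prefix("urn:uuid:")
}
```
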
@CBenoit Benoît Cortier (CBenoit) changed the title from "feat: QUIC agent tunnel — protocol, listener, agent client" to "feat: initial implementation of QUIC agent tunnel" Apr 21, 2026
@CBenoit Benoît Cortier (CBenoit) enabled auto-merge (squash) April 21, 2026 14:59

@CBenoit Benoît Cortier (CBenoit) left a comment

Merging now, we should follow up in subsequent PRs for the rest. Just need the conflict to be fixed.


Copilot AI left a comment

Pull request overview

Copilot reviewed 43 out of 44 changed files in this pull request and generated 3 comments.


Comment on lines +124 to +126
for candidate in targets.iter() {
    let target_str = format!("{}:{}", candidate.host(), candidate.port());

Copilot AI Apr 21, 2026

When building `target_str` for `ConnectRequest`, using `format!("{}:{}", candidate.host(), candidate.port())` will produce an invalid `host:port` string for IPv6 targets (missing `[]`). Since `TargetAddr` already provides `as_addr()` (which includes brackets for IPv6), prefer that to avoid malformed targets being sent to the agent tunnel.
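
The bracket problem is easy to reproduce with the standard library alone. The helper below is a hypothetical illustration of bracket-aware formatting, not the `TargetAddr::as_addr()` implementation; it relies on `SocketAddr`'s `Display`, which wraps IPv6 addresses in `[]`.

```rust
use std::net::{IpAddr, SocketAddr};

/// Format `host:port`, bracketing the host when it is an IPv6 literal.
/// Hypothetical helper for illustration only.
fn format_target(host: &str, port: u16) -> String {
    match host.parse::<IpAddr>() {
        // `SocketAddr`'s Display adds the brackets for IPv6 addresses.
        Ok(ip) => SocketAddr::new(ip, port).to_string(),
        // Not an IP literal: treat it as a DNS name.
        Err(_) => format!("{host}:{port}"),
    }
}
```

For the host `::1` and port 3389 the naive `format!("{}:{}", ...)` yields `::1:3389`, which a `host:port` parser cannot split unambiguously, while `format_target` yields `[::1]:3389`.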

Comment on lines +10 to +15
/// Validate a Bearer token as an enrollment JWT signed by the provisioner key.
///
/// Returns `true` if the token is a well-formed JWT whose signature verifies
/// against `provisioner_key`, whose `exp` has not passed, and whose `scope`
/// is `TunnelEnroll` (or `Wildcard`). Returns `false` for any failure.
///
Copilot AI Apr 21, 2026

`validate_enrollment_jwt` uses `JwtValidator::strict(now)`, which (per picky) also enforces the `nbf` claim in addition to `exp`. The doc comment currently only mentions `exp`; please either update the comment to mention that `nbf` is required/checked, or switch to a validator configuration that makes `nbf` optional if that’s the intended behavior for enrollment JWTs.

use super::*;

#[test]
fn parse_up_command_args_uses_default_config_path() {
Copilot AI Apr 21, 2026

This test name is misleading: it doesn’t exercise any config-path behavior (no config path is parsed/used in `parse_up_command_args`). Rename the test to reflect what it actually asserts (e.g., that required flags are parsed and `--advertise-routes` is split correctly).

Suggested change
- fn parse_up_command_args_uses_default_config_path() {
+ fn parse_up_command_args_parses_required_flags_and_splits_advertise_routes() {

Sync with master to pull latest changes. Only conflict was Cargo.lock,
regenerated via `cargo generate-lockfile`. Touched crates compile
cleanly.

CI runs `cargo +nightly fmt --all -- --check` with the workspace's
`group_imports = StdExternalCrate` rule, which stable rustfmt silently
ignores (unstable feature). The `use agent_tunnel::AgentTunnelHandle;`
line belonged in the external-crate group (with `anyhow`, `tokio`, etc.)
not after the `crate::*` group.
@CBenoit Benoît Cortier (CBenoit) merged commit d89dd0c into master Apr 21, 2026
41 checks passed
@CBenoit Benoît Cortier (CBenoit) deleted the feat/quic-tunnel-1-core branch April 21, 2026 16:44
irvingouj@Devolutions (irvingoujAtDevolution) added a commit that referenced this pull request Apr 21, 2026
Builds on #1738 (core infrastructure) to make the agent tunnel
production-ready for a DVLS-driven deployment. Not yet: agent
management webapp UI (follow-up PR) and Playwright E2E harness
(follow-up PR).

Transparent routing:

- `crates/agent-tunnel/src/routing.rs`: `RoutingDecision` pipeline —
  explicit `jet_agent_id` from the JWT → subnet match → domain
  suffix match (longest wins) → direct connect. Single `try_route`
  entry point consumed by all gateway proxy paths.
- `crates/agent-tunnel/src/registry.rs`: `find_agents_for(host)` +
  `RouteAdvertisementState::matches_target()` do the lookup in one
  spot; offline agents are skipped.
- Gateway proxy integration: `api/fwd.rs`, `api/kdc_proxy.rs`,
  `api/rdp.rs`, `rd_clean_path.rs`, `generic_client.rs`, `rdp_proxy.rs`
  all call `try_route` before falling through to direct TCP.
- Tests: `agent-tunnel/src/integration_test.rs` (2 full-stack QUIC
  E2E), `tests/agent_tunnel_registry.rs` (13), `tests/agent_tunnel_
  routing.rs` (8).
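
As a rough picture of the pipeline order (explicit agent id, then subnet match, then longest domain suffix, then direct connect), consider the sketch below. Everything in it is hypothetical: the `Agent` struct, the `route` signature, and the choice to fall through when the explicitly requested agent is offline are assumptions, and the real `try_route` works against `AgentRegistry` rather than a slice.

```rust
use std::net::Ipv4Addr;

#[derive(Debug, PartialEq)]
enum RoutingDecision {
    ViaAgent(String),
    Direct,
}

/// Hypothetical flattened view of one registry entry.
struct Agent {
    id: String,
    online: bool,
    subnets: Vec<(Ipv4Addr, u8)>,
    domains: Vec<String>, // assumed to be stored lowercase
}

fn in_subnet(ip: Ipv4Addr, (net, prefix): (Ipv4Addr, u8)) -> bool {
    // A /0 prefix needs a shift by 32, which `checked_shl` reports as None.
    let mask = u32::MAX.checked_shl(32 - u32::from(prefix.min(32))).unwrap_or(0);
    (u32::from(ip) & mask) == (u32::from(net) & mask)
}

fn route(agents: &[Agent], explicit_id: Option<&str>, host: &str) -> RoutingDecision {
    // Offline agents are skipped at every step.
    let online = || agents.iter().filter(|a| a.online);
    let via = |a: &Agent| RoutingDecision::ViaAgent(a.id.clone());

    // 1. An explicit agent id (from the token) wins.
    if let Some(agent) = explicit_id.and_then(|id| online().find(|a| a.id == id)) {
        return via(agent);
    }
    // 2. Subnet match, only meaningful for IPv4 literals.
    if let Ok(ip) = host.parse::<Ipv4Addr>() {
        if let Some(agent) = online().find(|a| a.subnets.iter().any(|&s| in_subnet(ip, s))) {
            return via(agent);
        }
    }
    // 3. Domain suffix match on a label boundary; the longest domain wins.
    // 4. Otherwise fall back to a direct connection.
    let host = host.to_ascii_lowercase();
    online()
        .flat_map(|a| a.domains.iter().map(move |d| (a, d)))
        .filter(|(_, d)| host == d.as_str() || host.ends_with(&format!(".{d}")))
        .max_by_key(|(_, d)| d.len())
        .map_or(RoutingDecision::Direct, |(a, _)| via(a))
}
```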

Agent-side certificate renewal:

- `enrollment.rs`: `is_cert_expiring(cert_path, threshold_days)` and
  `generate_csr_from_existing_key(key_path, agent_name)` — the key
  never changes across renewals, the gateway just signs a new cert
  with the same public key.
- `tunnel.rs`: on connect, if the cert is within 15 days of expiry,
  the agent sends a `CertRenewalRequest` control message with a new
  CSR, waits for `CertRenewalResponse::Success`, writes the renewed
  cert and CA, and reconnects.
- `agent-tunnel/src/listener.rs`: gateway-side handler signs the
  CSR via `CaManager::sign_agent_csr` and returns the new cert chain.
  (Stub replaced: master's handler emitted a debug log and dropped
  the message.)

QUIC endpoint override:

- `enrollment.rs`: new `quic_endpoint_override: Option<String>`
  parameter on `enroll_agent` — if set, overrides the endpoint
  returned by the enroll API. Needed because the gateway's
  `quic_endpoint` is derived from `conf.hostname`, which in Docker
  is often the container ID (not routable from outside).
- `main.rs`: new `--quic-endpoint` CLI flag and `jet_quic_endpoint`
  JWT claim; precedence is CLI flag > JWT claim > enroll API
  response.
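
The stated precedence (CLI flag, then JWT claim, then enroll API response) reduces to a short `Option` chain. The function name and parameter shapes below are hypothetical; only the ordering comes from the commit message.

```rust
/// Pick the QUIC endpoint: CLI flag > JWT claim > enroll API response.
fn resolve_quic_endpoint(
    cli_flag: Option<String>,
    jwt_claim: Option<String>,
    api_response: String,
) -> String {
    cli_flag.or(jwt_claim).unwrap_or(api_response)
}
```

With no flag and no claim, the value returned by the enroll API is used unchanged.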

Agent-side routing primitives:

- `tunnel_helpers.rs`: `Target::Ip` / `Target::Domain` enum parsed
  from the gateway's `ConnectRequest::target`, `resolve_target`
  (domain → DNS), `connect_to_target` (happy-eyeballs).

Deployment:

- `Dockerfile`: multi-stage build for the gateway. Produces an
  image that can run behind a DVLS-managed reverse proxy.
- `docker-compose.yml`: gateway + network setup for local dev.
- `.dockerignore` + `.gitignore` updates.

Tests: 22 agent-tunnel lib + 3 proto version + 24 proto control +
11 proto session + 13 registry + 8 routing integration + 64 gateway
lib, all green. Zero clippy warnings; nightly fmt clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
irvingouj@Devolutions (irvingoujAtDevolution) added a commit that referenced this pull request Apr 21, 2026
Feature branch is the superset of all upcoming PRs:
- Base: master (PR1 merged as #1738)
- PR2 content (transparent routing, cert renewal, Docker): in place
- PR3 content (Windows MSI installer tunnel dialog, Linux entrypoint):
  overlaid from the old feature tip.
- PR4 content (gateway webapp agent UI + Playwright e2e): overlaid
  from the old feature tip.

Compiles clean. Not yet fmt-verified for the webapp side.
irvingouj@Devolutions (irvingoujAtDevolution) added a commit that referenced this pull request Apr 21, 2026
Builds on #1738 (core infrastructure). Follow-up PRs will add the
Windows/Linux installer integration, gateway webapp agent
management UI, Docker deployment, and Playwright E2E harness.

Transparent routing:

- `crates/agent-tunnel/src/routing.rs`: `RoutingDecision` pipeline —
  explicit `jet_agent_id` from the JWT → subnet match → domain
  suffix match (longest wins) → direct connect. Single `try_route`
  entry point consumed by all gateway proxy paths.
- `crates/agent-tunnel/src/registry.rs`: `find_agents_for(host)` +
  `RouteAdvertisementState::matches_target()` do the lookup in one
  spot; offline agents are skipped.
- Gateway proxy integration: `api/fwd.rs`, `api/kdc_proxy.rs`,
  `api/rdp.rs`, `rd_clean_path.rs`, `generic_client.rs`, `rdp_proxy.rs`
  all call `try_route` before falling through to direct TCP.
- Tests: `agent-tunnel/src/integration_test.rs` (2 full-stack QUIC
  E2E), `tests/agent_tunnel_registry.rs` (13), `tests/agent_tunnel_
  routing.rs` (8).

Agent-side certificate renewal:

- `enrollment.rs`: `is_cert_expiring(cert_path, threshold_days)` and
  `generate_csr_from_existing_key(key_path, agent_name)` — the key
  never changes across renewals, the gateway just signs a new cert
  with the same public key.
- `tunnel.rs`: on connect, if the cert is within 15 days of expiry,
  the agent sends a `CertRenewalRequest` control message with a new
  CSR, waits for `CertRenewalResponse::Success`, writes the renewed
  cert and CA, and reconnects.
- `agent-tunnel/src/listener.rs`: gateway-side handler signs the
  CSR via `CaManager::sign_agent_csr` and returns the new cert chain.
  (Stub replaced: master's handler emitted a debug log and dropped
  the message.)

QUIC endpoint override:

- `enrollment.rs`: new `quic_endpoint_override: Option<String>`
  parameter on `enroll_agent` — if set, overrides the endpoint
  returned by the enroll API. Needed because the gateway's
  `quic_endpoint` is derived from `conf.hostname`, which in a
  containerized deployment is often the container ID (not routable
  from outside).
- `main.rs`: new `--quic-endpoint` CLI flag and `jet_quic_endpoint`
  JWT claim; precedence is CLI flag > JWT claim > enroll API
  response.

Agent-side routing primitives:

- `tunnel_helpers.rs`: `Target::Ip` / `Target::Domain` enum parsed
  from the gateway's `ConnectRequest::target`, `resolve_target`
  (domain → DNS), `connect_to_target` (happy-eyeballs).

Tests: 22 agent-tunnel lib + 3 proto version + 24 proto control +
11 proto session + 13 registry + 8 routing integration + 64 gateway
lib, all green. Zero clippy warnings; nightly fmt clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>