fix(sandbox): rewrite messaging credential placeholders#1286
Conversation
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
All contributors have signed the DCO ✍️ ✅ |
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
WebSocket implementation alignment noteThis PR follows common hardening patterns used by established WebSocket clients, servers, and proxy implementations, while keeping the implementation tailored to OpenShell's credential-boundary model. The important alignment points are behavioral:
This matters because OpenShell is not trying to become a general WebSocket application framework. The purpose here is narrower: preserve the no-secret-in-sandbox invariant, then resolve placeholders only at the OpenShell relay boundary for opted-in client-to-server text frames. The safest shape is therefore a small, explicit protocol surface: if the relay can validate and process the negotiated WebSocket state, it proceeds; if not, it closes before forwarding an unsafe upgraded flow. The remaining confidence-building work is conformance and durability, not feature expansion. The follow-up path is to keep the in-tree RFC regression matrix, maintain the Docker-backed WebSocket conformance e2e lane, and consider an Autobahn-style relay harness or parser fuzzing after this PR lands. |
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Follow-up hardening passThis pass applies common WebSocket proxy hardening patterns to the OpenShell relay: parse negotiation as structured protocol data, validate server-selected options against the original client offer, canonicalize only the supported safe extension forms, and fail closed when the rewrite path sees parameters it does not explicitly support. Concretely, This matters for credential rewriting because the proxy becomes a protocol participant after it mutates an upgraded flow. Loose negotiation can let the upstream select compression state or subprotocol behavior the relay did not offer or understand, which creates room for state confusion, fail-open compression handling, or secret-bearing payloads moving through a path that no longer has the invariants the rewriter depends on. The new behavior keeps the supported surface narrow and deterministic: either the relay can validate and safely process the negotiated shape, or the handshake fails before forwarding the 101 to the client. Remaining conformance work is still useful, especially an Autobahn-style relay harness and broader fuzz/property coverage for handshake parsing and frame sequencing. Those should come after this focused spec pass so future parser or engine swaps have a stable regression matrix to preserve. |
|
I have read the DCO document and I hereby sign the DCO. |
|
I have read the Contributor Agreement including DCO and I hereby sign the Contributor Agreement and DCO |
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Focused conformance matrix addedI added
This is not a replacement for Autobahn. It is the closest low-cost regression layer for this PR because it covers the specific OpenShell behavior boundary: once the proxy becomes an active WebSocket participant for credential rewriting, the handshake assumptions, compression negotiation, and post-upgrade frame relay need to be tested together rather than only as isolated parser/frame unit tests. Full Autobahn remains useful as a follow-up CI or manual conformance job, but it would add Docker/runtime/tooling weight beyond this focused PR pass. |
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
|
Progress update: policy configuration now treats WebSocket as a first-class inspected protocol in the incremental update flow. What changed:
Why it matters: Validation:
|
…dential-rewrite Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
|
Progress update: WebSocket credential replacement is now configurable from the incremental policy workflow and covered through the route-selected relay path. What changed:
Why it matters: Validation:
|
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
|
Follow-up update after the policy CLI shape review:
openshell policy update demo \
--add-endpoint realtime.example.com:443:read-write:websocket:enforce:websocket-credential-rewrite \
--binary /usr/bin/nodeRationale: Message-content policy would buy a different class of control than this PR currently implements: it would let OpenShell authorize WebSocket messages by application intent inside the text payload, such as event type, channel, action, model/tool name, or service-specific JSON shape, instead of only by host/port, upgrade path, and Validation after this update:
All passed locally. |
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
|
Progress update: GraphQL-over-WebSocket semantic policy pass Implemented in
Validation run:
Maintainer decisions still explicit:
|
…dential-rewrite Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
|
I found two P1 issues that should be addressed before this can merge.
These are both security-sensitive fail-closed/policy-enforcement issues, so I would treat them as blockers for this PR. |
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Maintainer feedback follow-upAddressed the two P1 blockers in What changed:
Validation:
The commit includes DCO signoff. |
Signed-off-by: Aaron Erickson <aerickson@nvidia.com> # Conflicts: # crates/openshell-driver-docker/src/lib.rs
Summary
Fixes #872 by making OpenShell the relay-boundary credential substitution point for messaging APIs. The PR keeps the opt-in WebSocket text-frame rewrite path and adds REST request-body credential rewrite for inspected REST endpoints.
Provider-managed credentials stay out of sandbox env values, argv, config files, logs, and sandbox-authored placeholder strings. OpenShell resolves them only at explicitly configured inspected egress paths, then fails closed if a reserved placeholder or provider-shaped alias remains unresolved.
Related Issue
Fixes #872
Reviewer Map
proto/sandbox.proto,crates/openshell-policy,crates/openshell-cli,crates/openshell-providers,crates/openshell-server,docs/reference/policy-schema.mdx,docs/sandboxes/policies.mdx, anddocs/security/best-practices.mdx.crates/openshell-sandbox/src/secrets.rs,proxy.rs,l7/rest.rs,l7/relay.rs, andl7/websocket.rs.crates/openshell-sandbox/src/l7/websocket.rs,l7/graphql.rs,l7/mod.rs,l7/relay.rs,l7/rest.rs,opa.rs, andsandbox-policy.rego.crates/openshell-sandboxtests,e2e/rust/tests/websocket_conformance.rs,tasks/test.toml, and.github/workflows/websocket-conformance.yml.Changes
openshell:resolve:env:KEYplaceholder contract and revisioned forms.OPENSHELL-RESOLVE-ENV-KEY, such asprovider-OPENSHELL-RESOLVE-ENV-API_TOKEN, for provider/env resolution.request_body_credential_rewrite, defaulting tofalse.request-body-credential-rewrite, accepted only forprotocol: rest.application/json,application/x-www-form-urlencoded, andtext/*REST request bodies when body rewrite is enabled.application/x-www-form-urlencodedfields with standard+and percent-decoding rules, rewrite placeholders in decoded values, re-encode the values, and recomputeContent-Length.Content-Lengthafter REST body rewrite.permessage-deflatenegotiation, raw opt-out behavior, GraphQL-over-WebSocket operation policy, and no-payload/no-secret logging.Security and Spec Notes
OPENSHELL-RESOLVE-ENValiases fail closed on paths that inspect credentials.Testing
Focused validation run on pushed head
59a51603713fa0c2b1c05cad02050daa83037925:cargo fmtcargo test -p openshell-cli policy_updatecargo test -p openshell-policycargo test -p openshell-sandbox request_bodycargo test -p openshell-sandbox forward_relaycargo test -p openshell-sandbox credentialcargo test -p openshell-sandbox websocketcargo test -p openshell-sandbox l7cargo test -p openshell-server summarize_cli_policy_merge_opcargo build -p openshell-cli --bin openshellcargo clippy -p openshell-sandbox --all-targets -- -D warningscargo clippy -p openshell-sandbox -p openshell-policy --all-targets -- -D warningsmise run docsmise run python:protomise run pre-commitAutobahn was not run because the repo does not currently include an Autobahn harness. This PR adds focused in-tree WebSocket protocol and compression regression coverage plus a dedicated Docker-backed WebSocket conformance e2e lane; the workflow is manual today with a maintainer decision below on scheduling.
Remaining Maintainer Decisions
.github/workflows/websocket-conformance.ymlinto scheduled nightly coverage after manual burn-in.access: fullis used.Checklist