Running OpenShell on a Mac Mini M4 (Docker Desktop) to sandbox coding agents, same setup I flagged in #745. Ran into an adjacent gap in what actually gets enforced.
Binary-level policy (seccomp, Landlock) allows rm, dd, mkfs, shred for legit use. That's fine. The problem isn't running them individually, it's chaining them in a single exec string:
```shell
rm -rf /data && dd if=/dev/zero of=/dev/sda
```
That ships as one command. Each binary is allowed. The chain is what does the damage.
`&&` is the one I've actually hit. `;` and pipelines like `find ... | xargs rm` show up too. Subshells/backticks are possible, but I haven't seen those in practice. Landlock and seccomp only see individual execve calls, so none of this composition is visible at that layer. It all exists at the shell level.
Want a policy that looks at the raw command string before execve, and blocks when multiple destructive primitives are chained together. Then let operators carve out exemptions where that pattern is actually intentional.
Simple string inspection with awareness of chaining operators is probably enough for this class of issue. Full AST parsing feels like overkill and fragile. A small set of dangerous primitives (rm, dd, mkfs, shred, maybe fdisk, diskutil) covers most of the high-risk cases.
This is a guardrail, not perfect detection. Some legit workflows will trip it, and that's what exemptions are for.
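To make the proposal concrete, here is a rough sketch of the kind of string-level check I mean. It is Python only for illustration; nothing here is an existing OpenShell API. `DESTRUCTIVE`, `WRAPPERS`, `primitive_of`, and `check` are hypothetical names, and the exemption format (a regex that must match the whole command) is an assumption, not a design decision.

```python
import re

# Hypothetical primitive list, taken from the issue text above.
DESTRUCTIVE = {"rm", "dd", "mkfs", "shred", "fdisk", "diskutil"}

# Words that can precede the real command inside a segment (assumed list).
WRAPPERS = {"sudo", "xargs", "env", "nohup", "time", "exec"}

# Chaining operators: &&, ||, ;, |, newline, backticks, $( and bare parens.
SPLIT_RE = re.compile(r"&&|\|\||[;|\n`]|\$\(|[()]")


def primitive_of(segment):
    """Return the destructive primitive a segment invokes, or None."""
    for word in segment.split():
        name = word.rsplit("/", 1)[-1]   # /bin/rm -> rm
        base = name.split(".", 1)[0]     # mkfs.ext4 -> mkfs
        if base in DESTRUCTIVE:
            return base
        if name in WRAPPERS or "=" in word or word.startswith("-"):
            continue  # skip wrappers, VAR=value assignments, wrapper flags
        return None   # first real command word is not destructive
    return None


def check(command, exemptions=()):
    """Return (allowed, primitives_found) for a raw command string."""
    # Operator-supplied exemptions: regexes that must match the whole command.
    if any(re.fullmatch(p, command.strip()) for p in exemptions):
        return True, []
    found = [p for p in map(primitive_of, SPLIT_RE.split(command)) if p]
    # One destructive primitive is fine; two or more chained ones are blocked.
    return len(found) < 2, found


print(check("rm -rf /data && dd if=/dev/zero of=/dev/sda"))
print(check("rm -rf build"))
```

The first call should come back blocked with both `rm` and `dd` listed, the second allowed. Known gaps in this sketch: it ignores quoting (an operator inside a quoted string still splits the segment), it misses wrappers that take a value argument such as `sudo -u root rm`, and it does nothing about `sh -c` strings or scripts written to disk. That matches the guardrail framing, but it is worth stating up front.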
#745 covers visibility into which layers are active. This is the adjacent gap: what gets enforced semantically, while the full command string still exists pre-execve.
Not sure where this should live. I see openshell-sandbox/data/sandbox-policy.rego, YAML policies via openshell-prover, and Rust-level filters in the sandbox crate. Would want direction before picking a lane.