feat(k8s): Support ImageVolumeSource for supervisor binary sideload #1299

@mrunalp

Problem Statement

The Kubernetes driver delivers the supervisor binary into sandbox pods using an
initContainer + emptyDir pattern: an init container runs copy-self to write
the binary from the supervisor image into an emptyDir volume, then the agent
container mounts it read-only. This approach has three drawbacks:

  1. Startup latency — the init container must complete before the agent
    container can start, adding a sequential copy step to every sandbox launch.
  2. Complexity — three moving parts (emptyDir volume, init container with
    copy-self subcommand, volume mount) make the pod spec harder to debug.
  3. Redundancy — Kubernetes ImageVolumeSource (GA in v1.36) can mount an OCI
    image directly as a read-only volume, making the copy step unnecessary.

Proposed Design

Add a SupervisorSideloadMethod enum with two variants and thread it through
the driver config, CLI args, Helm chart, and pod template construction:

  • image-volume (default) — emits a Kubernetes image volume source that
    mounts the supervisor OCI image directly. No init container or emptyDir
    needed. The binary is available at the same path
    (/opt/openshell/bin/openshell-sandbox) because the supervisor image has
    /openshell-sandbox at its root, and the image volume is mounted at
    /opt/openshell/bin.

  • init-container (fallback) — preserves the existing emptyDir + init
    container pattern for clusters running Kubernetes < v1.33.
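A minimal sketch of what the SupervisorSideloadMethod enum could look like, assuming the two string forms are image-volume and init-container as listed above. The variant names, the String error type, and the error message are illustrative assumptions, not the actual contents of config.rs.

```rust
use std::fmt;
use std::str::FromStr;

/// How the supervisor binary is delivered into the sandbox pod.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum SupervisorSideloadMethod {
    /// Mount the supervisor OCI image directly as a read-only volume.
    #[default]
    ImageVolume,
    /// Copy the binary into an emptyDir from an init container.
    InitContainer,
}

impl fmt::Display for SupervisorSideloadMethod {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            Self::ImageVolume => "image-volume",
            Self::InitContainer => "init-container",
        })
    }
}

impl FromStr for SupervisorSideloadMethod {
    // Assumption: a plain String error; the real crate may use its own error type.
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "image-volume" => Ok(Self::ImageVolume),
            "init-container" => Ok(Self::InitContainer),
            other => Err(format!(
                "unknown supervisor sideload method '{other}' \
                 (expected 'image-volume' or 'init-container')"
            )),
        }
    }
}
```

Keeping Display and FromStr symmetric means the same string can round-trip through the CLI flag, the environment variable, and the Helm value without a separate mapping table.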

Components changed

File                                              Change
crates/openshell-driver-kubernetes/src/config.rs  SupervisorSideloadMethod enum with FromStr/Display/Default
crates/openshell-driver-kubernetes/src/driver.rs  supervisor_image_volume() helper, branch in apply_supervisor_sideload()
crates/openshell-driver-kubernetes/src/main.rs    --supervisor-sideload-method CLI arg / OPENSHELL_SUPERVISOR_SIDELOAD_METHOD env var
crates/openshell-server/src/lib.rs                Wire env var into server config
deploy/helm/openshell/values.yaml                 supervisor.sideloadMethod value
deploy/helm/openshell/templates/statefulset.yaml  Pass env var to driver

Volume JSON emitted (image-volume)

{
  "name": "openshell-supervisor-bin",
  "image": {
    "reference": "openshell/supervisor:latest",
    "pullPolicy": "IfNotPresent"
  }
}

What stays the same

Agent container modifications (command override to
/opt/openshell/bin/openshell-sandbox, runAsUser: 0, read-only volume mount)
are identical for both methods. The k8s-openapi dependency does not change,
because volumes are built as raw JSON with serde_json::json!().
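The split between what varies per method and what stays fixed can be sketched as below. SideloadMethod, VolumeKind, SideloadPlan, and plan_sideload are hypothetical names used only for illustration. The driver itself builds raw JSON in apply_supervisor_sideload() rather than returning a plan like this.

```rust
/// Hypothetical stand-in for the driver's sideload method setting.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SideloadMethod {
    ImageVolume,
    InitContainer,
}

/// Which kind of volume backs the supervisor binary mount.
#[derive(Debug, PartialEq, Eq)]
enum VolumeKind {
    Image,
    EmptyDir,
}

/// Summary of what the pod spec needs for a given method (illustrative only).
#[derive(Debug, PartialEq, Eq)]
struct SideloadPlan {
    volume: VolumeKind,
    needs_init_container: bool,
    agent_command: &'static str,
    run_as_user: i64,
    mount_read_only: bool,
}

fn plan_sideload(method: SideloadMethod) -> SideloadPlan {
    // Only the volume source and the init container depend on the method.
    let (volume, needs_init_container) = match method {
        SideloadMethod::ImageVolume => (VolumeKind::Image, false),
        SideloadMethod::InitContainer => (VolumeKind::EmptyDir, true),
    };
    // The agent container changes are the same on both paths.
    SideloadPlan {
        volume,
        needs_init_container,
        agent_command: "/opt/openshell/bin/openshell-sandbox",
        run_as_user: 0,
        mount_read_only: true,
    }
}
```

Structuring the branch this way keeps the method-specific logic in one match, so the shared agent container settings cannot drift between the two paths.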

Alternatives Considered

  1. Switch entirely to ImageVolumeSource, drop initContainer — simpler but
    would break clusters running Kubernetes < v1.33. The fallback path costs
    almost nothing to maintain since the existing helper functions are unchanged.

  2. Auto-detect via API server version — query the cluster version at runtime
    and choose automatically. Rejected because it adds complexity, and operators
    should make an explicit choice about which K8s features their cluster
    supports. A misconfigured image-volume on an older cluster produces a clear
    Kubernetes API error.

  3. Use subPath to mount only the binary — subPath for image volumes was
    added in the v1.33 beta. Not needed because mounting the entire supervisor
    image at /opt/openshell/bin already places the binary at the correct path,
    and the supervisor image is FROM scratch with only the binary.

Agent Investigation

  • Explored crates/openshell-driver-kubernetes/src/driver.rs — supervisor
    sideload is implemented in apply_supervisor_sideload() (line 741) with
    helpers supervisor_volume(), supervisor_init_container(), and
    supervisor_volume_mount(). All volumes are built as raw JSON via
    serde_json::json!(), so no k8s-openapi upgrade is needed.
  • Confirmed the supervisor image places the binary at /openshell-sandbox
    (filesystem root). Mounting the image volume at /opt/openshell/bin makes
    the binary available at /opt/openshell/bin/openshell-sandbox — same path
    as the initContainer approach.
  • Verified k8s-openapi is at v0.21.1 with v1_26 feature and does not
    include ImageVolumeSource types, confirming the raw JSON approach is
    necessary.
  • Tested end-to-end on a local Kubernetes v1.37 / CRI-O 1.36 cluster. Sandbox
    pod reached 1/1 Running with the image volume source and no supervisor init
    container. See architecture/plans/image-volume-sideload-testing.md for full
    test procedure and results.
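The path reasoning in the second bullet above can be checked with a small sketch. path_in_container is a hypothetical helper, not a function in the driver. It assumes an image volume exposes the image root at the mount point, so a file at /openshell-sandbox in the image appears directly under the mount directory.

```rust
use std::path::{Path, PathBuf};

/// Resolve where a file from a mounted image appears inside the container.
fn path_in_container(mount_point: &Path, path_in_image: &Path) -> PathBuf {
    // Paths inside the image are written relative to the image root ("/").
    // Drop the leading "/" so that `join` appends to the mount point
    // instead of replacing it with the absolute path.
    let relative = path_in_image.strip_prefix("/").unwrap_or(path_in_image);
    mount_point.join(relative)
}
```

With a mount point of /opt/openshell/bin and an in-image path of /openshell-sandbox, this resolves to /opt/openshell/bin/openshell-sandbox, the same location the init container approach writes to.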

Checklist

  • I've reviewed existing issues and the architecture docs
  • This is a design proposal, not a "please build this" request

Labels

area:sandbox (Sandbox runtime and isolation work)
