Releases: dstackai/dstack-enterprise

0.20.24-v1

11 Jun 14:10

Dev environments

Zed

dstack now supports Zed as a dev environment IDE:

type: dev-environment
ide: zed
resources:
  gpu: L4

Once the dev environment is up, the CLI prints a zed:// link that opens the remote project in Zed over SSH. Since Zed doesn't require any plugins, no server pre-installation is needed — the Zed server is installed automatically on first connect.

$ dstack apply
...
Submit a new run? [y/n]: y
 NAME                     BACKEND                  GPU                     PRICE       STATUS      SUBMITTED
 fast-fly-1               aws (us-east-2)          gpu=L4:24GB:1           $0.1838     running     16:36
                                                                           (spot)

fast-fly-1 provisioning completed (running)
pip install ipykernel...

To open in Zed, use link below:

  zed://ssh/fast-fly-1/dstack/run

To connect via SSH, use: `ssh fast-fly-1`

To exit, press Ctrl+C.

Services

Replica groups

The spot_policy and reservation properties can now be specified at the replica group level. This allows distributing replicas across reserved and spot capacity, e.g., running baseline replicas on a reservation while autoscaling overflow replicas on spot instances:

type: service
image: my-image
port: 80

replicas:
  - name: baseline
    reservation: my-reservation
    count: 1

  - name: overflow
    spot_policy: auto
    count: 0..3
    scaling:
      metric: rps
      target: 1

Shepherd Model Gateway

Services using Shepherd Model Gateway now support gRPC communication with both vLLM and SGLang workers. Previously, only the SGLang runtime with the HTTP connection mode was supported.

Below is an example service configuration running vLLM gRPC workers:

type: service
name: prefill-decode

env:
  - HF_TOKEN
  - MODEL_ID=zai-org/GLM-4.5-Air-FP8

replicas:
  - count: 1
    image: python:3.12-slim
    commands:
      - pip install smg
      - |
          smg launch \
            --pd-disaggregation \
            --model-path $MODEL_ID \
            --enable-igw \
            --host 0.0.0.0 \
            --port 8000 \
            --prefill-policy cache_aware
    router:
      type: sglang
    resources:
      cpu: 4

  - count: 1
    image: vllm/vllm-openai:latest
    commands:
      - pip install -U "vllm[grpc]"
      - |
          python3 -m vllm.entrypoints.grpc_server \
            --model $MODEL_ID \
            --host 0.0.0.0 \
            --port 8000 \
            --kv-transfer-config '{"kv_connector":"NixlConnector","kv_role":"kv_producer"}'
    resources:
      gpu: H200

  - count: 1
    image: vllm/vllm-openai:latest
    commands:
      - pip install -U "vllm[grpc]"
      - |
          python3 -m vllm.entrypoints.grpc_server \
            --model $MODEL_ID \
            --host 0.0.0.0 \
            --port 8000 \
            --kv-transfer-config '{"kv_connector":"NixlConnector","kv_role":"kv_consumer"}'
    resources:
      gpu: H200

port: 8000

dstack automatically detects each worker's runtime (vLLM or SGLang) and connection mode (HTTP or gRPC) by probing it. With gRPC, the SMG router tokenizes requests once and routes on tokens instead of raw text, reducing duplicate work and making cache_aware routing more effective.

JarvisLabs

The jarvislabs backend now supports offers with RTXPRO6000 GPUs.

Azure

subnet_ids

Similarly to vpc_ids, the azure backend now allows selecting specific subnets to be attached to dstack VMs via the new subnet_ids property, mapping regions to subnets in the <resource-group>/<vnet>/<subnet> format:

projects:
  - name: main
    backends:
      - type: azure
        subscription_id: ...
        tenant_id: ...
        creds:
          type: default
        regions: [westeurope]
        subnet_ids:
          westeurope: my-resource-group/my-vnet/my-subnet

This is useful when the VNet contains subnets that dstack shouldn't pick automatically, e.g. subnets delegated to other Azure services.

What's changed

Full changelog: dstackai/dstack@0.20.23...0.20.24

0.20.23-v1

04 Jun 10:29

This release includes several bug fixes and performance optimizations.

What's Changed

Full Changelog: dstackai/dstack@0.20.22...0.20.23

0.20.22-v1

28 May 10:14

Backends

VastAI

The vastai backend gets new backend-specific options in run and fleet configurations for advanced offer filtering:

type: dev-environment
backend_options:
- type: vastai
  offer_order: price
  min_reliability: 0.97
  min_score: 250

See the YAML reference for more details on new backend_options.

Accelerators

Tenstorrent

The update adds support for Tenstorrent Blackhole accelerators, including PCIe cards and systems such as LoudBox, QuietBox, and Galaxy. Previously dstack supported only Tenstorrent Wormhole accelerators. Also, we've reworked the Tenstorrent example.

Examples

A new Miles example shows how to use dstack and Miles for reinforcement learning (RL) post-training of a 32B language model with GRPO across a multi-node cluster.

Breaking changes

  • Dropped support for AWS P3 instances (V100).

What's Changed

Full Changelog: dstackai/dstack@0.20.21...0.20.22

0.20.21-v2

25 May 10:32

This release fixes a bug where instance provisioning could get stuck due to errors with placement group reuse (#3905).

0.20.21-v1

21 May 13:12

Backends

JarvisLabs

This release adds JarvisLabs as a new backend, allowing dstack to provision GPU and CPU VMs on JarvisLabs, including spot GPU instances.

To configure the backend, log into your JarvisLabs account, create an API key, and add it to ~/.dstack/server/config.yml:

projects:
- name: main
  backends:
    - type: jarvislabs
      creds:
        type: api_key
        api_key: ...

Kubernetes

Multiple clusters

A single kubernetes backend can now manage multiple Kubernetes clusters. Each cluster is selected via a kubeconfig context and becomes its own dstack region:

projects:
- name: main
  backends:
  - type: kubernetes

    kubeconfig:
      filename: ~/.kube/config

    contexts:
    - name: gpu-cluster-a
    - name: gpu-cluster-b

Each context can configure its own proxy_jump.hostname and proxy_jump.port, and the namespace is taken from each kubeconfig context. When creating a dstack volume or gateway, the region field selects which cluster the resource is provisioned in.
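
As a rough sketch of the per-context settings described above, the fragment below adds proxy_jump to each context. The nesting of hostname and port under proxy_jump is inferred from the dotted property names, and the addresses and ports are placeholder values, not defaults.

```yaml
projects:
- name: main
  backends:
  - type: kubernetes
    kubeconfig:
      filename: ~/.kube/config
    contexts:
    - name: gpu-cluster-a
      # Placeholder jump host address and port for this cluster (assumption).
      proxy_jump:
        hostname: 203.0.113.10
        port: 32000
    - name: gpu-cluster-b
      proxy_jump:
        hostname: 203.0.113.20
        port: 32000
```

A volume could then be pinned to one of the clusters through region, assuming the region name matches the kubeconfig context name:

```yaml
type: volume
backend: kubernetes
name: my-volume

# Assumed to equal the context name of the target cluster.
region: gpu-cluster-a
size: 100GB
```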

The previous single-cluster configuration (without contexts) continues to work but is no longer recommended and may be removed in the future. Refer to the backends docs for the up-to-date configuration and migration guidance.

Object labeling

All dstack-managed Kubernetes resources (jump pods, job pods, gateways, volumes, registry-auth secrets, services) now share a consistent set of labels, making it easier to filter and audit dstack resources with kubectl:

  • app.kubernetes.io/name=dstack-{ssh-proxy,job,gateway,volume}
  • app.kubernetes.io/instance
  • app.kubernetes.io/managed-by=dstack
  • k8s.dstack.ai/project
  • k8s.dstack.ai/name (if applicable)
  • k8s.dstack.ai/user (if applicable)

Bug fixes

  • Jobs no longer retry indefinitely when the target fleet is at capacity.
  • Negative retry.duration values (e.g. -1) are now rejected during configuration parsing instead of silently producing a nonsensical retry spec.
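
For reference, a retry spec with a valid duration might look like the sketch below. The run type, the command, and the one-hour value are illustrative assumptions; only retry.duration itself is named above.

```yaml
type: task
commands:
  - python train.py  # hypothetical workload
retry:
  # Negative values such as -1 are now rejected at parse time.
  duration: 1h
```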

What's changed

Full changelog: dstackai/dstack@0.20.20...0.20.21

0.20.20-v1

15 May 11:47

Services

NVIDIA Dynamo

This update adds support for Prefill-Decode (PD) disaggregated inference with NVIDIA Dynamo.

Previously, dstack supported PD disaggregation only with Shepherd Model Gateway as the router and SGLang as the inference engine for workers. With this update, a replica group can declare router: { type: dynamo }, allowing workers to use inference engines such as SGLang, vLLM, or TensorRT-LLM.

type: service
name: dynamo-pd

env:
  - HF_TOKEN
  - MODEL_ID=zai-org/GLM-4.5-Air-FP8

replicas:
  - count: 1
    docker: true
    commands:
      - apt-get update
      - apt-get install -y python3-dev python3-venv
      - python3 -m venv ~/dyn-venv
      - source ~/dyn-venv/bin/activate
      - pip install -U pip
      - pip install "ai-dynamo[sglang]==1.1.1"
      - git clone https://github.com/ai-dynamo/dynamo.git
      # Brings up the NATS / etcd compose stack and runs the Dynamo HTTP frontend.
      - docker compose -f dynamo/deploy/docker-compose.yml up -d
      - |
        python3 -m dynamo.frontend \
          --http-host 0.0.0.0 --http-port 8000 \
          --discovery-backend etcd --router-mode kv \
          --kv-cache-block-size 64
    resources:
      cpu: 4
    router:
      type: dynamo

  - count: 1..4
    scaling:
      metric: rps
      target: 3
    python: "3.12"
    nvcc: true
    commands:
      # dstack injects DSTACK_ROUTER_INTERNAL_IP after the router replica
      # is provisioned. Compose the etcd/NATS endpoints from it.
      - export ETCD_ENDPOINTS="http://$DSTACK_ROUTER_INTERNAL_IP:2379"
      - export NATS_SERVER="nats://$DSTACK_ROUTER_INTERNAL_IP:4222"
      # Set to enable /health endpoint required by dstack probes.
      - export DYN_SYSTEM_PORT="8000"
      # Wait until the router's etcd and NATS ports are actually accepting connections.
      - |
        until (echo > /dev/tcp/$DSTACK_ROUTER_INTERNAL_IP/2379) 2>/dev/null \
           && (echo > /dev/tcp/$DSTACK_ROUTER_INTERNAL_IP/4222) 2>/dev/null; do
          echo "waiting for etcd/NATS on $DSTACK_ROUTER_INTERNAL_IP..."; sleep 3
        done
      - pip install "ai-dynamo[sglang]==1.1.1"
      - |
        python3 -m dynamo.sglang \
          --model-path $MODEL_ID --served-model-name $MODEL_ID \
          --discovery-backend etcd --host 0.0.0.0 \
          --page-size 64 \
          --disaggregation-mode prefill --disaggregation-transfer-backend nixl
    resources:
      gpu: H200

  - count: 1..8
    scaling:
      metric: rps
      target: 2
    python: "3.12"
    nvcc: true
    commands:
      - export ETCD_ENDPOINTS="http://$DSTACK_ROUTER_INTERNAL_IP:2379"
      - export NATS_SERVER="nats://$DSTACK_ROUTER_INTERNAL_IP:4222"
      - export DYN_SYSTEM_PORT="8000"
      - |
        until (echo > /dev/tcp/$DSTACK_ROUTER_INTERNAL_IP/2379) 2>/dev/null \
           && (echo > /dev/tcp/$DSTACK_ROUTER_INTERNAL_IP/4222) 2>/dev/null; do
          echo "waiting for etcd/NATS on $DSTACK_ROUTER_INTERNAL_IP..."; sleep 3
        done
      - pip install "ai-dynamo[sglang]==1.1.1"
      - |
        python3 -m dynamo.sglang \
          --model-path $MODEL_ID --served-model-name $MODEL_ID \
          --discovery-backend etcd --host 0.0.0.0 \
          --page-size 64 \
          --disaggregation-mode decode --disaggregation-transfer-backend nixl
    resources:
      gpu: H200

port: 8000
model: zai-org/GLM-4.5-Air-FP8

# Custom probe is required for PD disaggregation.
probes:
  - type: http
    url: /health
    interval: 15s

dstack provisions the router replica, injects DSTACK_ROUTER_INTERNAL_IP into non-router replicas, and lets Dynamo workers connect directly to the router’s etcd and NATS services.

Refer to the Dynamo example for full deployment instructions.

Replica groups

It's now possible to configure the image, docker, python, nvcc, and privileged properties at the replica group level. This enables complex multi-component services like NVIDIA Dynamo, where different replicas require different runtime environments.
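
A minimal sketch of what this allows, assuming two hypothetical replica groups: one built from a stock Python image and one using a pinned Python version with nvcc. The group names, commands, and resources are placeholders rather than a recommended setup.

```yaml
type: service
port: 8000

replicas:
  # CPU-only group that runs from its own Docker image.
  - name: frontend
    count: 1
    image: python:3.12-slim
    commands:
      - python3 -m http.server 8000  # placeholder command
    resources:
      cpu: 2

  # GPU group that sets python and nvcc instead of an image.
  - name: workers
    count: 1
    python: "3.12"
    nvcc: true
    commands:
      - python3 -m http.server 8000  # placeholder command
    resources:
      gpu: L4
```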

Exports

Gateways

Gateways can now be exported and shared across projects, enabling centralized gateway management in multi-project setups.

$ dstack export --project main create my-export --gateway shared-gateway --importer team
 NAME       FLEETS  GATEWAYS        IMPORTERS 
 my-export  -       shared-gateway  team      

Now, if you list gateways in the team project, you'll see the exported gateway:

$ dstack gateway --project team
 NAME                 BACKEND          HOSTNAME        DOMAIN                 DEFAULT  STATUS  
 main/shared-gateway  aws (eu-west-1)  108.131.126.35  gtw.mycompany.example           running

Additionally, gateway domains now support optional project name interpolation using ${{ run.project_name }}, allowing different projects to use different domains on the same shared gateway.

type: gateway
name: shared-gateway

backend: aws
region: eu-west-1

domain: ${{ run.project_name }}.mycompany.example

Global exports

Users with global admin privileges can now export SSH fleets and gateways to all projects at once, enabling organization-wide resource sharing.

$ dstack export create global-export --gateway shared-gateway --global
 NAME           FLEETS  GATEWAYS        IMPORTERS
 global-export  -       shared-gateway  *

AWS

EFA clusters

Previously, fleets that used EFA (Elastic Fabric Adapter) with multiple network interfaces required public_ips: False. With this release, dstack allows creating such fleets with public IPs. This simplifies the use of interconnected clusters on AWS by removing the need to run the dstack server and CLI inside a private VPC.

Kubernetes

Backend configuration

The namespace property of the kubernetes backend configuration is now formally deprecated. It still takes effect and remains the source of truth in this version, but future versions will read the namespace from the current kubeconfig context instead.

Migration guide

  • If namespace is unset or set to default in both the backend config and the kubeconfig, no action is required — default continues to be used.
  • If namespace is set to the same value (e.g. ns-a) in both the backend config and the kubeconfig, no action is required.
  • If namespace is set to ns-a in the backend config but the kubeconfig has a different value (or none), set the namespace to ns-a in your kubeconfig context to prepare for future versions.
  • It is only safe to remove namespace from the backend config if its value is default.
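
For the third case, the namespace can be recorded in the kubeconfig context itself. The fragment below is a sketch of the standard kubeconfig layout; the context, cluster, and user names are placeholders.

```yaml
# ~/.kube/config (fragment)
contexts:
- name: my-context
  context:
    cluster: my-cluster
    user: my-user
    # Matches the namespace currently set in the dstack backend config.
    namespace: ns-a
current-context: my-context
```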

0.20.19-v1

30 Apr 11:02

Services

RPS window for autoscaling

Services now support a window property in the scaling spec that defines the time window used to calculate RPS. Allowed values are 30s, 1m, and 5m (default is 1m). Previously, the RPS was always calculated using a 1m window.

type: service
image: nginx
port: 80

replicas: 0..1
scaling:
  metric: rps
  # 1 request per second, calculated over a 5-minute window
  target: 1
  window: 5m

Kubernetes

registry_auth

The kubernetes backend now supports the registry_auth property for pulling Docker images from private registries:

type: service
image: nvcr.io/nim/deepseek-ai/deepseek-r1-distill-llama-8b
registry_auth:
  username: $oauthtoken
  password: ${{ secrets.ngc_api_key }}

dstack automatically creates and sets up imagePullSecrets for the pods. This requires new permissions for the Kubernetes role:

rules:
  resources: ["secrets"]
  verbs: ["create", "delete"]
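
The fragment above lists only the new resource and verbs. In a complete Role manifest the rule would sit next to the permissions dstack already needs. A minimal sketch follows, with a hypothetical role name and namespace. Secrets belong to the core API group, hence the empty apiGroups entry.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dstack-role    # hypothetical name
  namespace: default   # replace with the namespace dstack uses
rules:
  # Rules already granted to dstack are omitted here.
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["create", "delete"]
```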

Read-only volumes

Kubernetes volume configurations now support a new read_only property. When set to true, it enforces readOnly: true in the pod's volumeMounts.

type: volume
backend: kubernetes
name: my-volume
size: 100GB
read_only: true

Server

Faster processing

The server has been optimized to reduce processing latencies. As a result, many operations now take less time: run provisioning is up to 14s faster and run termination is up to 7s faster.

Examples

Documentation and examples have been refreshed, including new Qwen3.6-27B and DeepSeek V4 examples. A new prefill-decode blog post shows how to run SGLang PD disaggregation via Shepherd Model Gateway.

Breaking changes

Python 3.9 support dropped

Running dstack on Python 3.9 is no longer supported, as Python 3.9 reached end-of-life on 2025-10-31. Please upgrade to Python 3.10 or later.

What's Changed

Full Changelog: dstackai/dstack@0.20.18...0.20.19

0.20.18-v1

23 Apr 14:56

CLI

For VM-based backends as well as SSH fleets, the CLI now shows Docker image pull progress in the format <extracted>/<downloaded>/<total>.

Offers

This update reduces the time required to fetch backend offers and initialize backends, making both dstack offer and dstack apply faster:

- runpod — 0.66s => 0.03s (22x)
- amddevcloud — 2.26s => 0.85s (2.7x)
- cudo — 2.48s => 1.02s (2.4x)
- verda — 3.27s => 1.74s (1.9x)
- lambda — 3.24s => 1.89s (1.7x)
- vastai — 3.27s => 1.77s (1.8x)
- gcp — 3.74s => 2.54s (1.5x)
- azure — 5.83s => 3.11s (1.9x)
- aws — 6.58s => 3.56s (1.8x)

Secrets

The Manager project role can now manage secrets if the allow_managers_manage_secrets property is enabled in the server’s default_permissions config:

default_permissions:
  allow_managers_manage_secrets: true

Previously, only the Admin role was allowed to manage secrets.

GPUs

This update adds support for GeForce RTX 2, 3, 4, and 5 series GPUs, which were previously not detected properly across both backend and SSH fleets.

GCP

The gcp backend now requires the compute.projects.get permission. Make sure this permission is granted to any custom IAM roles used by dstack.

What's changed

Full changelog: dstackai/dstack@0.20.17...0.20.18

0.20.17-v1

16 Apr 12:47

PD disaggregation

This update simplifies running SGLang with Prefill-Decode disaggregation.

Previously, PD disaggregation required configuring router on the gateway, which meant
the gateway had to run in the same cluster as the service to communicate with service
replicas.

With this update, router is configured on a service replica group instead. This allows
using a standard gateway outside the service cluster.

Below is an example service configuration for running zai-org/GLM-4.5-Air-FP8 using replica groups:

type: service
name: prefill-decode
image: lmsysorg/sglang:latest

env:
  - HF_TOKEN
  - MODEL_ID=zai-org/GLM-4.5-Air-FP8

replicas:
  - count: 1
    commands:
      - pip install sglang_router
      - |
        python -m sglang_router.launch_router \
          --host 0.0.0.0 \
          --port 8000 \
          --pd-disaggregation \
          --prefill-policy cache_aware
    router:
      type: sglang
    resources:
      cpu: 4

  - count: 1..4
    scaling:
      metric: rps
      target: 3
    commands:
      - |
        python -m sglang.launch_server \
          --model-path $MODEL_ID \
          --disaggregation-mode prefill \
          --disaggregation-transfer-backend nixl \
          --host 0.0.0.0 \
          --port 8000 \
          --disaggregation-bootstrap-port 8998
    resources:
      gpu: H200

  - count: 1..8
    scaling:
      metric: rps
      target: 2
    commands:
      - |
        python -m sglang.launch_server \
          --model-path $MODEL_ID \
          --disaggregation-mode decode \
          --disaggregation-transfer-backend nixl \
          --host 0.0.0.0 \
          --port 8000
    resources:
      gpu: H200

port: 8000
model: zai-org/GLM-4.5-Air-FP8

# Custom probe is required for PD disaggregation.
probes:
  - type: http
    url: /health
    interval: 15s

Note: this setup requires the service fleet or cluster to provide a CPU node for the
router replica.

Kubernetes

The kubernetes backend adds support for both network and instance volumes.

Network volumes

You can either create a new network volume or register an existing one. To create a new
network volume, specify size and optionally storage_class_name and/or
access_modes:

type: volume
backend: kubernetes
name: my-volume

size: 100GB

This automatically creates a PersistentVolumeClaim and associates it with the volume.

If you don't specify storage_class_name, the decision is delegated to the
DefaultStorageClass admission controller, if enabled.

If you don't specify access_modes, it defaults to [ReadWriteOnce]. To attach
volumes to multiple runs at the same time, set it to [ReadWriteMany] or
[ReadWriteMany, ReadOnlyMany].
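
A sketch of a volume meant to be shared by several runs at once, using the two optional properties described above. The StorageClass name is hypothetical and would need to refer to a class in your cluster that supports the requested access mode.

```yaml
type: volume
backend: kubernetes
name: my-shared-volume

size: 100GB
# Hypothetical StorageClass name.
storage_class_name: my-rwx-storage-class
access_modes: [ReadWriteMany]
```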

To reuse an existing PersistentVolumeClaim, specify its name in claim_name:

type: volume
backend: kubernetes
name: my-volume

claim_name: existing-pvc

Once a volume configuration is applied, you can attach it to your runs via volumes:

type: dev-environment
name: vscode-vol

ide: vscode

volumes:
  - name: my-volume
    path: /volume_data

Instance volumes

In addition to network volumes, the kubernetes backend now supports instance volumes:

type: dev-environment
name: vscode-vol

ide: vscode

volumes:
  - instance_path: /mnt/volume
    path: /volume_data

Unlike network volumes, which persist across instances, instance volumes persist data
only within a particular instance. They are useful for storing caches or when you
manually mount a shared filesystem into the instance path.

Note: using volumes with the kubernetes backend requires the corresponding
permissions.

Performance

Fetching backend offers for the first time has been optimized and is now much faster. As
a result, dstack apply, dstack offer, and the offers UI are all more responsive.
Here are the improvements for some of the major backends:

- aws — 41.43s => 6.61s (6.3x)
- azure — 12.49s => 5.50s (2.3x)
- gcp — 13.51s => 5.20s (2.6x)
- nebius — 10.74s => 3.80s (2.8x)
- runpod — 9.36s => 0.09s (104x)
- verda — 9.49s => 2.33s (4.1x)

Fleets

In-place update

Backend fleets now have initial support for in-place updates. You can update nodes,
reservation, tags, resources, backends, regions, availability_zones,
instance_types, spot_policy, and max_price without re-creating the entire fleet.
If existing idle instances do not match the updated configuration, dstack replaces
them.
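
As an illustration, editing values such as nodes or max_price in a fleet configuration like the hypothetical one below and applying it again would update the fleet in place. The fleet name and all values are assumptions chosen for the example.

```yaml
type: fleet
name: my-fleet

# Previously 0..2; widened without re-creating the fleet.
nodes: 0..4
backends: [aws]
regions: [us-east-2]
spot_policy: auto
# Previously 1.0; raised in the same update.
max_price: 2.0

resources:
  gpu: L4
```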

Default resources

Fleets used to have default resources set to cpu=2.. mem=8GB.. disk=100GB.. when
left unspecified. This meant any offers with fewer resources were excluded from such
fleets. If you wanted to run on a mem=4GB VM, you had to specify resources in both
the run and fleet configurations.

Now fleets have no default resources, so all offers are available by default. If you
need to add extra constraints on which offers can be provisioned in a fleet, specify
resources explicitly.

Run configurations continue to have default minimum resources set to
cpu=2.. mem=8GB.. disk=100GB.. to avoid provisioning instances that are too small.
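
For example, to land on a mem=4GB VM it should now be enough to lower the minimum in the run configuration alone, since a fleet without explicit resources no longer filters such offers out. A sketch with illustrative values:

```yaml
type: dev-environment
ide: vscode

# Overrides the run-level default of mem=8GB.. for this run only.
resources:
  cpu: 2..
  memory: 4GB..
```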

Offers

The dstack offer CLI command now supports the --fleet argument, which allows you to
see only offers from the specified fleets.

dstack offer --fleet my-fleet --fleet another-project/other-fleet

The same is now supported in the UI on both the Offers and Launch pages.

Exports

Importers can now delete an import via
dstack import delete <export-project>/<export-name>. This is useful when an export
was created by the exporter, but the importer no longer needs it and does not want to
wait until the exporter deletes it.

AWS

RTX Pro 6000

The aws backend adds support for g7e.* instances offering RTXPRO6000 GPUs.

Docker

Default Docker registry

If you'd like to cache Docker images through your own Docker registry, you can now
configure it when starting the dstack server:

export DSTACK_SERVER_DEFAULT_DOCKER_REGISTRY=<registry base hostname>
export DSTACK_SERVER_DEFAULT_DOCKER_REGISTRY_USERNAME=<registry username>
export DSTACK_SERVER_DEFAULT_DOCKER_REGISTRY_PASSWORD=<registry password>

These settings should only be used for registries that act as a pull-through cache for
Docker Hub. This is useful if you would like to avoid rate limits when you have too
many image pulls.

Migration note

Warning

Since v0.20.0, dstack has required fleets before runs can be submitted.

Until now, the deprecated DSTACK_FF_AUTOCREATED_FLEETS_ENABLED feature flag allowed submitting runs without fleets. In 0.20.17, this flag has been removed.

0.20.16-v1

06 Apr 12:05

Server

Performance

This release introduces a major overhaul of dstack server background processing. A single server
replica can now handle ~10x more resources, supporting at least 1000 active instances and runs. In
benchmarks, we observed 2x-10x faster processing (see #3551).

  • Provisioning 200 instances: 12 minutes -> 4 minutes.
  • Running a 200-node task: >25 minutes -> 4 minutes.
  • Terminating 50 instances: 60 seconds -> 10 seconds.

The performance gains come from a new, more efficient background processing architecture. Server
hardware requirements and memory consumption remain the same.

If you need to temporarily revert this behavior, set
DSTACK_FF_PIPELINE_PROCESSING_DISABLED=1 before starting the server.

Upgrade notes

Warning

This release includes significant internal changes to the dstack server. Test in a staging
environment before upgrading production whenever possible.

Warning

Rolling upgrades from 0.20.13 or older directly to 0.20.16 are not supported. Do not run
replicas on 0.20.13 (or older) and 0.20.16 at the same time. Upgrade to 0.20.15 first, or
scale server replicas down to 1 before upgrading.

SSH proxy

Servers can enforce proxy-only SSH access by combining SSH proxy with the new
DSTACK_SERVER_SSHPROXY_ENFORCED flag. When enabled, runs omit user-provided keys from authorized
lists and expect clients to connect via the proxy endpoint that run details expose. For more details, see the server deployment guide.

Note

SSH proxy is experimental, and behavior may change in future releases.

UI

SSH keys

User settings now include an SSH keys tab where you can upload OpenSSH public keys, see their fingerprints, and remove keys that no longer belong to you. Uploaded keys let you open SSH sessions without relying on the client key that dstack attach manages automatically, and duplicate keys are rejected with a clear error.

CLI

dstack attach

When SSH proxy is enabled on the server, dstack attach now routes through the proxy automatically and receives the proxy host, port, and upstream ID from run connection info. Servers can opt into proxy-only access by setting DSTACK_SERVER_SSHPROXY_ENFORCED, which stops embedding direct SSH keys in runs.

export DSTACK_SERVER_SSHPROXY_ENFORCED=1

Backends

RunPod

RunPod backends can now provision on-demand CPU offerings in secure cloud regions, so jobs that request gpu: 0 schedule successfully without tricking the scheduler. Disk size checks respect the per-offer limits RunPod publishes.

resources:
  gpu: 0
  cpu: 8
  memory: 32GB

Verda

Verda startup scripts and SSH keys are now generated per instance and removed reliably on teardown, preventing stale credentials and improving cleanup when a rollout provisions multiple machines.

Major bug fixes

  • Improved Git-related CLI repo errors with actionable messages for missing credentials, detached HEAD state, and non-repository directories (#3730).

What's changed

Full changelog: dstackai/dstack@0.20.15...0.20.16