fix(cmd): add ps command for containerd recovery #601

Open
sidneychang wants to merge 1 commit into urunc-dev:main from sidneychang:add-ps-command
Conversation

@sidneychang (Contributor) commented May 1, 2026

Description

Add a runc-compatible ps --format json command to urunc.

containerd-shim-urunc-v2 reuses the runc shim manager and task service. During containerd restart, the recovery path calls Pids(), which eventually invokes the runtime binary as:

urunc ps --format json <container-id>

Without this command, containerd cannot obtain the host-visible PID associated with the urunc task and may treat the shim as leaked.
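
To illustrate what the caller needs from this command, here is a minimal sketch of decoding the output into a list of PIDs. `parsePids` is a hypothetical helper written for this example, not the actual containerd/go-runc code; it assumes only that the output is a JSON array of integers, as described in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parsePids is a hypothetical helper (not part of containerd or go-runc).
// It decodes the stdout of `ps --format json` into a slice of host PIDs,
// assuming the output is a JSON array of integers.
func parsePids(out []byte) ([]int, error) {
	var pids []int
	if err := json.Unmarshal(out, &pids); err != nil {
		return nil, fmt.Errorf("decode ps output: %w", err)
	}
	return pids, nil
}

func main() {
	// Example payload only; a real call would capture the runtime's stdout.
	pids, err := parsePids([]byte("[512463]\n"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(pids)
}
```

If the runtime has no `ps` subcommand, it presumably prints something other than a JSON array, so decoding along these lines fails and the caller has no PID to work with.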

The command returns the VMM / sandbox monitor PID stored in state.json as a []int, matching the JSON format expected by containerd/go-runc and produced by runc's ps implementation. The result is kept as a slice so that it can be extended later if urunc supports multiple VMM or unikernel processes for one container.
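
The behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `containerState` and `pidsFromState` are hypothetical names, and the `pid` key follows the OCI runtime state layout; that urunc's state.json uses the same key is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// containerState models only the field this sketch needs.
// Assumption: state.json exposes the monitor PID under the "pid" key,
// as the OCI runtime state does.
type containerState struct {
	Pid int `json:"pid"`
}

// pidsFromState is a hypothetical helper. It reads the PID from the state
// document and renders it as a JSON array of integers.
func pidsFromState(stateJSON []byte) ([]byte, error) {
	var st containerState
	if err := json.Unmarshal(stateJSON, &st); err != nil {
		return nil, fmt.Errorf("decode state: %w", err)
	}
	// Start from an empty, non-nil slice so that a missing PID is encoded
	// as "[]" rather than "null".
	pids := []int{}
	if st.Pid > 0 {
		pids = append(pids, st.Pid)
	}
	return json.Marshal(pids)
}

func main() {
	// Example state document; the values are made up for illustration.
	out, err := pidsFromState([]byte(`{"id":"example","pid":4242}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Using a slice even for a single PID keeps the output shape stable if more processes are reported later.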

Related issues

How was this tested?

Before the fix

The urunc container was running before restarting containerd:

root@sev:~/zxd/urunc-per-container# nerdctl ps -a
CONTAINER ID    IMAGE                                                         COMMAND                   CREATED          STATUS    PORTS    NAMES
ae3a6ebb87fe    harbor.nbfc.io/nubificus/urunc/nginx-qemu-linux-raw:latest    "/urunit /usr/sbin/n…"    5 seconds ago    Up                 urunc-view-check-1

After restarting containerd:

root@sev:~/zxd/urunc-per-container# systemctl restart containerd

the same container became Created:

root@sev:~/zxd/urunc-per-container# nerdctl ps -a
CONTAINER ID    IMAGE                                                         COMMAND                   CREATED           STATUS     PORTS    NAMES
ae3a6ebb87fe    harbor.nbfc.io/nubificus/urunc/nginx-qemu-linux-raw:latest    "/urunit /usr/sbin/n…"    14 seconds ago    Created             urunc-view-check-1

The state stayed Created on the next check:

root@sev:~/zxd/urunc-per-container# nerdctl ps -a
CONTAINER ID    IMAGE                                                         COMMAND                   CREATED           STATUS     PORTS    NAMES
ae3a6ebb87fe    harbor.nbfc.io/nubificus/urunc/nginx-qemu-linux-raw:latest    "/urunit /usr/sbin/n…"    26 seconds ago    Created             urunc-view-check-1

The urunc shim process was still alive:

root@sev:~/zxd/urunc-per-container# ps aux | grep shim
root      512463  0.0  0.0 1233708 4704 ?        Sl   04:39   0:00 /usr/local/bin/containerd-shim-urunc-v2 -namespace default -id ae3a6ebb87fea672711de7b7ab75315eb7e39605330b71236d607657b1707c9a -address /run/containerd/containerd.sock -debug
root      513223  0.0  0.0   6548  1544 pts/1    S+   04:40   0:00 grep --color=auto shim

After stopping and removing the container:

root@sev:~/zxd/urunc-per-container# nerdctl stop ae3
ae3
root@sev:~/zxd/urunc-per-container# nerdctl rm ae3
ae3

the shim was still left behind:

root@sev:~/zxd/urunc-per-container# ps aux | grep shim
root      512463  0.0  0.0 1233708 4704 ?        Sl   04:39   0:00 /usr/local/bin/containerd-shim-urunc-v2 -namespace default -id ae3a6ebb87fea672711de7b7ab75315eb7e39605330b71236d607657b1707c9a -address /run/containerd/containerd.sock -debug
root      513492  0.0  0.0   6548  1548 pts/1    S+   04:40   0:00 grep --color=auto shim

At that point, the container metadata had already disappeared:

root@sev:~/zxd/urunc-per-container# nerdctl ps -a
CONTAINER ID    IMAGE    COMMAND    CREATED    STATUS    PORTS    NAMES

This shows the failure mode before the fix: containerd restart caused the urunc container to fall from Up to Created, and cleanup left a live containerd-shim-urunc-v2 process behind.

After the fix

After replacing the binaries with the fixed version, a new urunc container was started:

root@sev:~/zxd/urunc-per-container# nerdctl -n default --snapshotter devmapper run -d --name urunc-view-check-1 --runtime io.containerd.urunc.v2 harbor.nbfc.io/nubificus/urunc/nginx-qemu-linux-raw:latest
4426490c6afd310ebe8653126f0479efe25ed60c32b48e585d9f95e9b028fd17

The container was Up before restarting containerd:

root@sev:~/zxd/urunc-per-container# nerdctl ps -a
CONTAINER ID    IMAGE                                                         COMMAND                   CREATED          STATUS    PORTS    NAMES
4426490c6afd    harbor.nbfc.io/nubificus/urunc/nginx-qemu-linux-raw:latest    "/urunit /usr/sbin/n…"    5 seconds ago    Up                 urunc-view-check-1

After restarting containerd:

root@sev:~/zxd/urunc-per-container# systemctl restart containerd

the container remained Up:

root@sev:~/zxd/urunc-per-container# nerdctl ps -a
CONTAINER ID    IMAGE                                                         COMMAND                   CREATED           STATUS    PORTS    NAMES
4426490c6afd    harbor.nbfc.io/nubificus/urunc/nginx-qemu-linux-raw:latest    "/urunit /usr/sbin/n…"    17 seconds ago    Up                 urunc-view-check-1

The container was then stopped and removed normally:

root@sev:~/zxd/urunc-per-container# nerdctl stop 4426
4426
root@sev:~/zxd/urunc-per-container# nerdctl rm 4426
4426

After removal, no urunc shim was left behind; only the grep process appeared:

root@sev:~/zxd/urunc-per-container# ps aux | grep shim
root      515037  0.0  0.0   6548  1544 pts/1    S+   04:42   0:00 grep --color=auto shim

LLM usage

ChatGPT

Checklist

  • I have read the contribution guide.
  • The linter passes locally (make lint).
  • The e2e tests of at least one tool pass locally (make test_ctr, make test_nerdctl, make test_docker, make test_crictl).
  • If LLMs were used: I have read the llm policy.

netlify Bot commented May 1, 2026

Deploy Preview for urunc canceled.

Name Link
🔨 Latest commit 358ca69
🔍 Latest deploy log https://app.netlify.com/projects/urunc/deploys/69f51b014b75900007b15f10

@sidneychang sidneychang marked this pull request as ready for review May 1, 2026 21:26
Add a runc-compatible `ps --format json` command to urunc.

containerd-shim-urunc-v2 reuses the runc shim manager and task
service. During containerd restart, the recovery path calls Pids(),
which eventually invokes the runtime binary as:

    urunc ps --format json <container-id>

Without this command, containerd cannot obtain the host-visible PID
associated with the urunc task and may treat the shim as leaked.

Return the VMM / sandbox monitor PID stored in state.json as a []int,
matching the JSON format expected by containerd/go-runc and runc's ps
implementation. Keep the result as a slice so it can be extended later
if urunc supports multiple VMM or unikernel processes for one container.

Signed-off-by: sidneychang <2190206983@qq.com>
Development

Successfully merging this pull request may close these issues.

urunc containers fall back to Created after containerd restart
