
Port the init binary code to Rust#670

Open
jakecorrenti wants to merge 25 commits into
libkrun:mainfrom
jakecorrenti:port-init

Conversation

@jakecorrenti

@jakecorrenti jakecorrenti commented May 7, 2026

Collaborator

This PR ports the init binary code to Rust. It acts like any of the other crates that we have within the project.

To run the examples, or to use the build with Podman, build and install the project as usual with `make BLK=1 NET=1 && sudo make BLK=1 NET=1 install`. No other steps change.

In my testing, the init binary remains small:

libkrun/init port-init ≡
❯ ll ../target/release/krun-init
.rwxr-xr-x@ 657k jcorrent  4 Jun 11:16 -I ../target/release/krun-init

NOTE: the AWS Nitro init binary is still written in C and will be ported next.

Fixes: #632

Comment thread src/init_blob/build.rs
mtjhrc added a commit to mtjhrc/libkrun that referenced this pull request Jun 9, 2026
Add four crates that together produce libkrun-init — a standalone
shared library for building init configurations with a C FFI:

- init-blob: Config/ConfigBuilder/GuestFile types with
  #[ffier::exportable]. Serializes to OCI runtime-spec JSON matching
  what the Rust init (PR libkrun#670) parses. ConfigError with FfiError
  derive for proper error handling across FFI.

- init-blob-cdylib: produces libkrun_init.so via ffier bridge macros.
  Includes krun-init-blob-gen binary for generating C headers and
  Rust client bindings (strong or --weak mode).

- init-blob-via-cdylib: strong Rust client (links libkrun_init.so
  at load time). For consumers that link both libraries.

- init-blob-via-cdylib-weak: weak Rust client (dlsym at runtime).
  For libkrun, which optionally calls into libkrun-init without a
  hard link dependency.

All crates live under init/ alongside the PID-1 binary.

Uses library_tag = 2 (reserving 1 for libkrun) and
primitives_prefix = "krun" so KrunStr/KrunBytes are shared.

Makefile: gen-init-blob-bindings target, versioned install of
libkrun_init.so (SONAME 0), C header, and pkg-config file.

Assisted-by: OpenCode:claude-opus-4.6
Signed-off-by: Matej Hrica <mhrica@redhat.com>
Replace the C-based build_default_init() in src/devices/build.rs with a
Rust crate (init/) compiled via a cargo subprocess. The new build.rs
probes whether the active rustc supports the $(uname -m)-unknown-linux-musl
target (for a static binary) and falls back to the native target with a
user-visible warning if not.
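
A minimal sketch of the fallback decision described above. It assumes the list of supported targets has already been obtained by some probe, which the commit message does not detail. `InitTarget` and `choose_init_target` are hypothetical names, not taken from the actual build.rs.

```rust
/// Outcome of the target probe: a static musl build, or the native fallback.
#[derive(Debug, PartialEq)]
enum InitTarget {
    /// Build for `<arch>-unknown-linux-musl` to get a static binary.
    Musl(String),
    /// Build for the host's default target; the caller should warn the user.
    Native,
}

/// Pick the musl triple for `arch` if it appears in `supported`,
/// otherwise fall back to the native target.
fn choose_init_target(arch: &str, supported: &[&str]) -> InitTarget {
    let triple = format!("{arch}-unknown-linux-musl");
    if supported.iter().any(|t| *t == triple) {
        InitTarget::Musl(triple)
    } else {
        InitTarget::Native
    }
}
```

Keeping the decision in a pure function like this would make the fallback testable without invoking rustc.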

The KRUN_INIT_BINARY_PATH override mechanism is preserved so that
out-of-tree binaries (e.g. pre-built SEV or TDX images) can still be
injected without rebuilding.

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Assisted-by: Claude Code:claude-sonnet-4.6
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Add init/src/fs.rs with:
- mount_or_busy(): helper that treats EBUSY as success
- mount_filesystems(): mounts devtmpfs, proc, sysfs, cgroup2,
  devpts, tmpfs(/dev/shm), and creates the /dev/fd symlink
- is_mount_point(): parses /proc/mounts (avoids triggering Podman
  auto-mounts that stat() would cause)
- mount_tmpfs(): mounts a tmpfs at an arbitrary path
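
The two helper ideas above can be sketched with the standard library alone. This is an illustration, not the code in init/src/fs.rs: `busy_is_ok` and `is_mount_point_in` are hypothetical names, the mount table is passed in as text rather than read from /proc/mounts, and the EBUSY value is hard-coded where real code would use `libc::EBUSY`.

```rust
use std::io;

/// Value of EBUSY on Linux; real code would use `libc::EBUSY` instead.
const EBUSY: i32 = 16;

/// Treat "already mounted" (EBUSY) as success and pass other errors through.
fn busy_is_ok(result: io::Result<()>) -> io::Result<()> {
    match result {
        Err(e) if e.raw_os_error() == Some(EBUSY) => Ok(()),
        other => other,
    }
}

/// Return true if `path` appears as a mount target in `mounts`, where
/// `mounts` is text in the /proc/mounts format:
/// `<source> <target> <fstype> <options> <dump> <pass>`.
fn is_mount_point_in(mounts: &str, path: &str) -> bool {
    mounts
        .lines()
        .filter_map(|line| line.split_whitespace().nth(1))
        .any(|target| target == path)
}
```

Note that /proc/mounts escapes spaces in paths as `\040`, which this sketch does not decode.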

Implement the mount_tee_block_root() function, used by both the SEV
and TDX features, which mounts /dev/vda and chroots into it.

For amd-sev this replaces the previous LUKS/KBS attestation path
entirely. The SEV and TDX boot paths are now identical at the init level.

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Assisted-by: Claude Code:claude-sonnet-4.6
Extend fs.rs with:
- try_mount(): mounts with a known fstype, or probes /proc/filesystems
  when fstype is None
- mount_block_root_device(): handles KRUN_BLOCK_ROOT_DEVICE by mounting
  the block device at /newroot, issuing KRUN_REMOVE_ROOT_DIR_IOCTL to
  drop the virtiofs temporary root, then pivoting with MS_MOVE
- mount_shared_root(): sets MS_REC|MS_SHARED propagation on /
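
A rough sketch of the probing idea behind try_mount(), under stated assumptions: the contents of /proc/filesystems are passed in as text, entries marked `nodev` are skipped (a guess, since the commit message does not say how candidates are filtered), and the real mount call is replaced by a caller-supplied closure. `block_fs_candidates` and `try_mount_with` are hypothetical names.

```rust
/// Extract filesystem types that need a block device from text in the
/// /proc/filesystems format (`nodev\t<name>` or `\t<name>` per line).
fn block_fs_candidates(filesystems: &str) -> Vec<&str> {
    filesystems
        .lines()
        .filter(|line| !line.starts_with("nodev"))
        .filter_map(|line| line.split_whitespace().next())
        .collect()
}

/// Call `mount_one` with the given type, or with each candidate in turn
/// when no type is given. Returns the type that succeeded, if any.
fn try_mount_with<'a>(
    fstype: Option<&'a str>,
    filesystems: &'a str,
    mut mount_one: impl FnMut(&str) -> bool,
) -> Option<&'a str> {
    match fstype {
        Some(t) => mount_one(t).then_some(t),
        None => block_fs_candidates(filesystems)
            .into_iter()
            .find(|&t| mount_one(t)),
    }
}
```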

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Assisted-by: Claude Code:claude-sonnet-4.6
Port init/dhcp.c to Rust in init/src/dhcp.rs. The public surface is a
single do_dhcp(iface) function with the same behaviour as the C version:

- Sends DHCPDISCOVER with Rapid Commit (option 80)
- On DHCPACK: applies address, route, MTU, and DNS directly
- On DHCPOFFER: completes the 4-way handshake, then applies
- On no response: returns Ok (VM may be IPv6-only)
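
To make the message flow above concrete, here is a small sketch of the options area of a DHCPDISCOVER with Rapid Commit and of the decision taken on the reply. The option codes and message types come from RFC 2132 and RFC 4039. `discover_options`, `NextStep`, and `next_step` are hypothetical names; the actual dhcp.rs may be structured quite differently.

```rust
// DHCP option codes and message types (RFC 2132, RFC 4039).
const OPT_MESSAGE_TYPE: u8 = 53;
const OPT_RAPID_COMMIT: u8 = 80;
const OPT_END: u8 = 255;
const DHCPDISCOVER: u8 = 1;
const DHCPOFFER: u8 = 2;
const DHCPACK: u8 = 5;
const MAGIC_COOKIE: [u8; 4] = [99, 130, 83, 99];

/// Build the options area of a DHCPDISCOVER that requests Rapid Commit.
fn discover_options() -> Vec<u8> {
    let mut opts = Vec::new();
    opts.extend_from_slice(&MAGIC_COOKIE);
    opts.extend_from_slice(&[OPT_MESSAGE_TYPE, 1, DHCPDISCOVER]);
    opts.extend_from_slice(&[OPT_RAPID_COMMIT, 0]); // zero-length option
    opts.push(OPT_END);
    opts
}

/// What the client does after waiting for a reply to the DISCOVER.
#[derive(Debug, PartialEq)]
enum NextStep {
    /// DHCPACK: the server accepted Rapid Commit, apply the lease now.
    Apply,
    /// DHCPOFFER: continue with DHCPREQUEST and wait for the ACK.
    SendRequest,
    /// No usable reply: leave the interface unconfigured and return Ok.
    Skip,
}

/// Map the message type of the reply (if any) to the next step.
fn next_step(reply_type: Option<u8>) -> NextStep {
    match reply_type {
        Some(DHCPACK) => NextStep::Apply,
        Some(DHCPOFFER) => NextStep::SendRequest,
        _ => NextStep::Skip,
    }
}
```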

Netlink structs not exposed by libc (ifinfomsg, ifaddrmsg, rtmsg) are
defined locally with #[repr(C)]. sockaddr_nl and sockaddr_in are
zero-initialised via mem::zeroed() to handle opaque padding fields.

Assisted-by: Claude Code:claude-sonnet-4.6
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Add init/src/config.rs, replacing the hand-rolled jsmn-based parser
with serde_json. Parses /.krun_config.json (or KRUN_CONFIG env var) and
returns a Config struct with:

- argv: Entrypoint ++ (args | Cmd), or None if absent
- workdir: WorkingDir or Cwd
- tmpfs: first tmpfs mount destination not already mounted

Environment variables from the Env array are applied during parsing,
with HOME and TERM always overwritten, all others set only if unset.
A missing or unparseable config file is silently ignored.
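
The argv and environment rules above can be sketched without serde. This is an illustration under assumptions: the environment is modelled as a `HashMap` rather than the process environment, an empty result is treated the same as an absent one, and `resolve_argv` and `apply_config_env` are hypothetical names.

```rust
use std::collections::HashMap;

/// Build argv as the entrypoint followed by the command arguments.
/// Returns None when both are absent (or, in this sketch, empty).
fn resolve_argv(
    entrypoint: Option<Vec<String>>,
    args: Option<Vec<String>>,
) -> Option<Vec<String>> {
    let argv: Vec<String> = entrypoint
        .into_iter()
        .flatten()
        .chain(args.into_iter().flatten())
        .collect();
    if argv.is_empty() { None } else { Some(argv) }
}

/// Apply `KEY=VALUE` entries: HOME and TERM are always overwritten,
/// every other variable is set only if it is not already present.
fn apply_config_env(env: &mut HashMap<String, String>, entries: &[&str]) {
    for entry in entries {
        // Entries without '=' are skipped in this sketch.
        let Some((key, value)) = entry.split_once('=') else { continue };
        if key == "HOME" || key == "TERM" || !env.contains_key(key) {
            env.insert(key.to_string(), value.to_string());
        }
    }
}
```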

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Assisted-by: Claude Code:claude-sonnet-4.6
Add setup_network() and setup_dhcp() to env.rs.

setup_network() brings up lo unconditionally. setup_dhcp() checks that
the interface exists before calling do_dhcp(), and logs a warning on
failure rather than aborting (DHCP failure is non-fatal — the VM may be
IPv6-only or have no network).

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Assisted-by: Claude Code:claude-sonnet-4.6
Extend env.rs with:
- apply_hostname(): sets hostname from HOSTNAME env var, defaulting
  to "localhost"
- apply_env(): maps KRUN_HOME -> HOME and KRUN_TERM -> TERM
- apply_rlimits(): parses the KRUN_RLIMITS comma-separated list of
  id,cur,max triples and applies each via setrlimit(2)

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Assisted-by: Claude Code:claude-sonnet-4.6
Add exec.rs with:
- setup_redirects(): walks /sys/class/virtio-ports and dup2s
  krun-stdin/stdout/stderr onto the corresponding file descriptors
- set_exit_code(): reports the workload exit code to the host via
  KRUN_EXIT_CODE_IOCTL, only when the root fs is virtiofs
- run_workload(): forks so PID 1 can reap children; the child calls
  exec_workload() which sets up redirects and execvp's the argv.
  Parent waits for the child, reports exit code, syncs, and reboots.
  KRUN_INIT_PID1=1 skips the fork and calls exec_workload() directly as PID 1.
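
As a sketch of the redirect lookup, the port names listed above can be mapped to the stdio descriptor they replace. It assumes each entry under /sys/class/virtio-ports exposes its port name as text with a trailing newline. `stdio_fd_for_port` is a hypothetical helper, and the dup2 step itself is omitted.

```rust
/// Map a virtio-console port name to the stdio descriptor it should replace.
/// Trailing whitespace is trimmed because sysfs attributes end in a newline.
fn stdio_fd_for_port(name: &str) -> Option<i32> {
    match name.trim_end() {
        "krun-stdin" => Some(0),
        "krun-stdout" => Some(1),
        "krun-stderr" => Some(2),
        _ => None,
    }
}
```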

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Assisted-by: Claude Code:claude-sonnet-4.6
Connect all modules in main() in order:
  1. mount_block_root()          [amd-sev | tdx]
  2. mount_filesystems()
  3. mount_block_root_device()   [KRUN_BLOCK_ROOT_DEVICE]
  4. mount_shared_root()
  5. setsid + TIOCSCTTY
  6. setup_network()
  7. config::load()
  8. mount_tmpfs()               [config tmpfs mount]
  9. apply_env / apply_hostname / apply_rlimits
 10. chdir to workdir
 11. run_workload(argv)

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Assisted-by: Claude Code:claude-sonnet-4.6
Add init/src/freebsd.rs with:
- kenv_get(): reads a variable from the FreeBSD kernel environment via
  kenv(2), which is the source of env vars for init before the process
  environment is set up
- populate_env_from_kenv(): imports the known KRUN_* variables from
  kenv into std::env at startup so the rest of the code can use
  std::env::var uniformly on both platforms
- open_console(): replicates login_tty(3) without linking libutil —
  revokes existing opens of /dev/console, opens it, creates a new
  session via setsid(2), sets the controlling terminal via TIOCSCTTY,
  and dup2s it onto stdio; falls back to /dev/null + /init.log
- mount_config_iso() / unmount_config_iso(): mounts the KRUN_CONFIG
  ISO 9660 image at /mnt via nmount(2) so the JSON config file can be
  read, then unmounts it afterwards

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Assisted-by: Claude Code:claude-sonnet-4.6
Connect the FreeBSD helpers into the boot sequence:
- open_console() and populate_env_from_kenv() are called at the very
  start of main() before anything else
- setsid/TIOCSCTTY are Linux-only; open_console() handles session setup
  on FreeBSD
- setlogin("root") is called on FreeBSD after console setup
- KRUN_DHCP and DHCP setup are Linux-only
- If KRUN_CONFIG is not set, mount_config_iso() is attempted; the ISO
  is unmounted immediately after config::load() returns
- fs::* mounts and mount_shared_root are Linux-only
- exec_workload() calls open_console() on FreeBSD instead of
  setup_redirects(), giving the child process a fresh controlling
  terminal before execvp

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Assisted-by: Claude Code:claude-sonnet-4.6
Replace the C-based BSD init build rule (which referenced the now-deleted
init/init.c) with a cargo build rule targeting the correct Rust triple.

Makefile:
- Remove dead INIT_SRC = init/init.c variable.
- Derive FREEBSD_RUST_TARGET from the host ARCH with arm64→aarch64
  substitution to get the correct Rust triple.
- Set CARGO_BSD_RUSTFLAGS with the clang cross-linker flags (mirroring
  the existing CC_BSD setup) so cargo can link for FreeBSD.
- aarch64-unknown-freebsd is a Tier 3 target with no prebuilt std;
  use +nightly -Z build-std for that case.

setup-build-env:
- Add rustup target add x86_64-unknown-freebsd (Tier 2, prebuilt std).
- Install nightly toolchain + rust-src for the aarch64 FreeBSD case.

cross-compilation.yml:
- Add clang to the Linux cross-compilation dependencies so the
  FreeBSD linker flags resolve correctly on Linux runners.

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Assisted-by: Claude Code:claude-sonnet-4.6
Implements the timesync feature behind the `timesync` cargo feature flag.
Receives host-side nanosecond timestamps over AF_VSOCK/SOCK_DGRAM on port
123 and applies them via clock_settime when the delta exceeds 100ms.
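
The threshold check can be sketched as a pure function. It assumes the delta is taken as an absolute difference, so the clock is stepped in either direction, which the commit message does not state explicitly. `MAX_DRIFT_NS` and `needs_clock_step` are hypothetical names.

```rust
/// Threshold from the commit message: only step the clock beyond 100 ms.
const MAX_DRIFT_NS: u64 = 100_000_000;

/// Decide whether the guest clock should be set to the host timestamp.
/// Both arguments are nanoseconds since the epoch.
fn needs_clock_step(host_ns: u64, guest_ns: u64) -> bool {
    host_ns.abs_diff(guest_ns) > MAX_DRIFT_NS
}
```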

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Assisted-by: Claude Code:claude-sonnet-4.6
Delete init/init.c, init/dhcp.c, init/dhcp.h, init/jsmn.h, and the
entire init/tee/ directory (snp_attest.c/h and the KBS client).

The amd-sev feature no longer performs LUKS unlock or KBS attestation —
it mounts /dev/vda as ext4 like the tdx path does.

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Assisted-by: Claude Code:claude-sonnet-4.6
Port of cd8b2be. The temporary root
directory hack has been replaced by NullFs, so the ioctl that cleaned
it up is no longer needed.

Assisted-by: Claude Code: claude-sonnet-4-6
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Port of 2593acc. When TSI is active,
brings up dummy0 and assigns it 10.0.0.1/8 so applications that probe
for network availability see a configured interface. Silently skips
setup if the dummy driver is absent in the kernel.

Assisted-by: Claude Code: claude-sonnet-4-6
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
The non-Linux setup_network() stub was empty, so the lo interface was
never raised inside FreeBSD guests. The C init unconditionally brought up
lo on all platforms (the #if __linux__ guard covered only the DHCP block,
not lo setup).

Use nix::sys::socket to open an AF_INET/SOCK_DGRAM socket and issue
SIOCSIFFLAGS / IFF_UP, matching the C behaviour.

Assisted-by: Claude Code: claude-sonnet-4-6
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
The C init accepted "Cmd", "Env", "WorkingDir"/"Cwd", and "Entrypoint"
keys via case-insensitive comparison; the Rust port only handled OCI
runtime-spec keys ("args", "env", "cwd" inside "process").

Add serde aliases so RawConfig's flat fields also accept the Docker image
config capitalisation:
  - "Cmd" aliases "args"
  - "Env" aliases "env"
  - "WorkingDir"/"Cwd" alias "cwd"
  - new "Entrypoint" field (top-level, Docker format only)

When Entrypoint is present it is prepended to the resolved args vector,
matching the C init's concat_entrypoint_argv() behaviour.

Assisted-by: Claude Code: claude-sonnet-4-6
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Two issues with apply_rlimits():

1. The C init parsed KRUN_RLIMITS with strtoull() and a single-char skip,
   so any separator character between ID, CUR, and MAX worked (e.g.
   "7:1024:4096").  The Rust code required "ID=CUR:MAX" and silently
   skipped entries using the historical colon-only format.

2. krun_set_rlimits() wraps the entire value in double-quotes
   (format!("\"{}\"", ...)), so the env var received by init is
   "\"7=1024:4096\"".  Neither the old Rust nor the C parser handled
   this correctly.

Fix both by extracting parse_rlimit_entry() which strips outer '"' chars
and splits on the first two occurrences of '=' or ':' via splitn(3).
Both formats and the quoted form now parse correctly.
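
A minimal sketch of the parsing approach described above, assuming numeric fields fit in `u64` and that malformed entries are dropped. Only the name parse_rlimit_entry() comes from the commit message; the signature and the `parse_rlimits` wrapper are guesses for illustration.

```rust
/// Parse one rlimit entry such as `7=1024:4096` or `7:1024:4096`,
/// tolerating surrounding double quotes. Returns (id, cur, max).
fn parse_rlimit_entry(entry: &str) -> Option<(u64, u64, u64)> {
    let entry = entry.trim_matches('"');
    // Split on the first two separators only, so at most three fields result.
    let mut parts = entry.splitn(3, |c: char| c == '=' || c == ':');
    let id: u64 = parts.next()?.parse().ok()?;
    let cur: u64 = parts.next()?.parse().ok()?;
    let max: u64 = parts.next()?.parse().ok()?;
    Some((id, cur, max))
}

/// Parse a comma-separated list of entries, skipping any that are malformed.
fn parse_rlimits(value: &str) -> Vec<(u64, u64, u64)> {
    value
        .trim_matches('"')
        .split(',')
        .filter_map(parse_rlimit_entry)
        .collect()
}
```

Both the `ID=CUR:MAX` and `ID:CUR:MAX` spellings, with or without the outer quotes, yield the same triple.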

Assisted-by: Claude Code: claude-sonnet-4-6
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
The C init replaced argv[0] with "/bin/sh" when neither KRUN_INIT nor
a config file was present, forwarding remaining cmdline tokens as shell
arguments.  The Rust init instead treats proc_args[1] as the executable
directly.

Add a comment explaining the rationale: callers that omit both KRUN_INIT
and a config file intend the cmdline argument to be the command, not a
shell script path, making the Rust behaviour more intuitive.

Assisted-by: Claude Code: claude-sonnet-4-6
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
A thread is destroyed when the parent calls execvp() (in PID1 mode).
The C init ran clock_worker() in a forked child process, which survives
exec.  Match that behaviour: create the vsock socket, fork, and run the
recv loop in the child; the parent closes its copy of the socket and
returns immediately.

Also switch to nix wrappers throughout: socket::socket(), socket::recv(),
time::clock_gettime(), and time::clock_settime() replace the equivalent
unsafe libc calls.  Add the nix "time" feature to support the clock
functions.

Assisted-by: Claude Code: claude-sonnet-4-6
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
The C init called exit(125) if setup_redirects() returned a negative
value (which it did when opendir("/sys/class/virtio-ports") failed).
The Rust port returned silently, letting the workload run with
unredirected stdio and no diagnostic.

Match the C behaviour: print an error and exit(125) so callers get a
visible signal that the redirects could not be set up.

Assisted-by: Claude Code: claude-sonnet-4-6
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
The C init checked *env_init_pid1 == '1' (first-byte comparison),
accepting any value starting with '1' — including "10" or "1\n" (which
can appear when the value originates from a file read).  The Rust port
used exact equality with "1", silently ignoring those variants.

Replace with is_ok_and(|v| v.starts_with('1')).
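
For illustration, the check can be isolated in a small function that takes the result of `std::env::var("KRUN_INIT_PID1")`. `init_pid1_enabled` is a hypothetical name; the actual code may inline the expression.

```rust
use std::env::VarError;

/// Return true when the variable is set and its value starts with '1',
/// mirroring the C init's first-byte comparison.
fn init_pid1_enabled(value: Result<String, VarError>) -> bool {
    value.is_ok_and(|v| v.starts_with('1'))
}
```

A caller would pass `std::env::var("KRUN_INIT_PID1")` directly. Values such as "1", "10", and "1\n" enable PID 1 mode, while "0", an empty string, or an unset variable do not.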

Assisted-by: Claude Code: claude-sonnet-4-6
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Port of upstream commit 378e524 ("init/dhcp: only overwrite
resolv.conf with DNS"). Only write /etc/resolv.conf when the DHCP
server provides nameservers, preserving any pre-existing content.

Assisted-by: Claude Code: claude-sonnet-4-6
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Best effort loosen /dev/kvm permissions to allow nested virtualization
by unprivileged processes inside the microVM (usually a single purpose
environment). Log errors but don't log ENOENT since the guest kernel
may not support KVM or nested virtualization might not be enabled.

Port of libkrun#708.

Assisted-by: Claude Code:opus-4.6
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Development

Successfully merging this pull request may close these issues.

Rewrite init in Rust

4 participants