feat: add fluidstack-ironwood DIB element and matrix workflow build #10

Merged
nacef3005 merged 1 commit into main from nacef/tpu-ironwood-support on Jun 10, 2026

@nacef3005 nacef3005 commented Jun 8, 2026

Collaborator

Adds first-class IPA ramdisk support for Ironwood TPU machines alongside the existing fish cluster build.

dib/fluidstack-ironwood — new DIB element

The element bundles everything IPA needs to provision Ironwood TPU machines:

  • Network fixes — 10-bmc-usb.network suppresses the spurious IPv6 default route advertised by the BMC USB NIC (enp0s20f0u8*), which would otherwise steal source-address selection away from the DCN NIC and break IPA-to-Ironic connectivity. 10-dcn-ironwood.network pre-sets MTU 9100 on idpf interfaces before systemd-networkd processes the RA MTU option, avoiding the Failed to set IPv6 MTU error at boot.
  • SSH over IPv6 — ssh.socket.d/ipv6.conf extends the Debian ssh.socket to listen on [::]:22; Ironwood DCN interfaces are IPv6-only so SSH is otherwise unreachable.
  • hwclock wrapper — Ironwood BMCs expose no hardware RTC over the USB management interface; the wrapper prevents a hwclock --systohc failure during IPA teardown from aborting provisioning.
  • IronwoodHardwareManager — a custom IPA hardware manager that adds a kexec_boot deploy step. After image deployment it kexecs directly into the installed OS, bypassing LinuxBoot's Verified Disk Boot (which requires LUKS encryption and EEPROM unlock on Ironwood machines).
  • ironwood-auto-kexec service — a systemd oneshot that runs at IPA boot before ironic-python-agent.service. If the Ironic node is already in active state (already provisioned), it kexecs into the installed OS immediately, avoiding a full re-provisioning cycle on every reboot. Supports both a separate BOOT-labelled partition (Ubuntu cloud images) and /boot/ inside the root partition (DIB-built images).
  • kexec-tools package — required by both the service and the hardware manager.
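The auto-kexec decision and the two supported boot layouts described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function names, the `vmlinuz-<version>` naming, and the two initrd naming patterns are assumptions made for the example, and the sketch only builds the `kexec` command lines rather than running them.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def should_auto_kexec(provision_state: str) -> bool:
    """Skip IPA only when Ironic already reports the node as provisioned."""
    return provision_state == "active"


def find_boot_files(root_mount: Path, boot_mount: Optional[Path] = None) -> Tuple[Path, Path]:
    """Return (kernel, initrd) from the BOOT partition if given, else <root>/boot."""
    boot_dir = boot_mount if boot_mount is not None else root_mount / "boot"
    # Assumption: kernels are named vmlinuz-<version>; take the lexicographically last.
    kernels = sorted(p for p in boot_dir.glob("vmlinuz-*") if p.is_file())
    if not kernels:
        raise FileNotFoundError(f"no kernel found in {boot_dir}")
    kernel = kernels[-1]
    version = kernel.name[len("vmlinuz-"):]
    # Assumption: only Debian-style and dracut-style initrd names are tried.
    for name in (f"initrd.img-{version}", f"initramfs-{version}.img"):
        initrd = boot_dir / name
        if initrd.is_file():
            return kernel, initrd
    raise FileNotFoundError(f"no initrd for {kernel.name} in {boot_dir}")


def kexec_commands(kernel: Path, initrd: Path, cmdline: str) -> List[List[str]]:
    """Build the kexec load and exec invocations without running them."""
    return [
        ["kexec", "-l", str(kernel), f"--initrd={initrd}", f"--command-line={cmdline}"],
        ["kexec", "-e"],
    ]
```

Passing `boot_mount` corresponds to the separate BOOT-labelled partition case; leaving it unset falls back to `/boot/` inside the root partition. Returning the commands as lists keeps the logic testable without root privileges or a real kexec.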

.github/workflows/ipa-ramdisk-build.yml — matrix build

The workflow is restructured as a matrix job so both images build in parallel on every push to main:

| Variant | Distro | Release | Image name |
|---|---|---|---|
| fish | CentOS | 9-stream | ipa-centos9-<branch>-fs |
| ironwood | Debian | trixie | ipa-debian-trixie-<branch>-fs |

Per-variant elements (fluidstack-ironwood for ironwood, none for fish) are passed through the […]TRA_ELEMENTS env var so the shell loop handles the empty-string case cleanly. The fish build no longer includes --element fluidstack-ironwood since those fixes are Ironwood-specific. Both archives are uploaded to the same S3 prefix; the filename differentiates them.
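To illustrate why an empty element list needs no special casing, here is a small sketch of the same idea in Python rather than the workflow's shell. The function name and its `extra_elements` parameter are hypothetical stand-ins for the env var; only the `--element` flag comes from the description above.

```python
from typing import List


def element_args(extra_elements: str) -> List[str]:
    """Turn a whitespace-separated element list into repeated --element flags."""
    args: List[str] = []
    # str.split() with no argument returns [] for "" and for whitespace-only
    # input, so the fish variant (no extra elements) adds no flags at all.
    for name in extra_elements.split():
        args += ["--element", name]
    return args
```

Under these assumptions, `element_args("fluidstack-ironwood")` yields one `--element` pair for the ironwood variant, while `element_args("")` yields an empty list for fish, so the same build command works for both matrix entries.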

@nacef3005 nacef3005 self-assigned this Jun 8, 2026

nacef3005 commented Jun 8, 2026

Collaborator Author

@nacef3005 nacef3005 marked this pull request as ready for review June 8, 2026 14:11

nacef3005 commented Jun 10, 2026

Collaborator Author

Merge activity

  • Jun 10, 1:29 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Jun 10, 1:29 PM UTC: @nacef3005 merged this pull request with Graphite.

@nacef3005 nacef3005 merged commit cb0fe29 into main Jun 10, 2026
1 check passed