Ramblings of an aging IT geek

moving root onto zfs, and the boot snag nobody warns you about

Migrating a Linux root filesystem onto ZFS for boot environments and snapshots, including the initramfs and bootloader gotchas that catch you out.

I moved a box's root filesystem onto ZFS at the weekend. Not the data disks, which have been on ZFS for ages, but the actual root, the thing the OS boots from. The reason is one feature: boot environments. Snapshot the root before an upgrade, and if the upgrade goes sideways you roll back to a working system in seconds instead of reaching for a rescue USB.

The appeal, briefly. With root on ZFS you get atomic snapshots of the entire OS for free. Before apt full-upgrade you take a snapshot. The upgrade eats itself, as upgrades occasionally do? You roll back to that snapshot (or boot a clone of it) and you're back. Combined with native compression (lz4 is effectively free) and the checksumming that's caught silent corruption on my data pools more than once, it's a genuinely better place for a root filesystem to live than ext4 on LVM.
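A minimal sketch of the snapshot-before-upgrade habit, assuming the root dataset is rpool/ROOT/debian (the name used later in this post). The function names and the pre-upgrade- naming scheme are my own illustration, not anything ZFS mandates.

```shell
#!/bin/sh
# Sketch: snapshot the root dataset before an upgrade.
# ROOT_DATASET is an assumption; substitute your own dataset name.
ROOT_DATASET="${ROOT_DATASET:-rpool/ROOT/debian}"

# Build a timestamped snapshot name for the dataset given as $1.
snapshot_name() {
    printf '%s@pre-upgrade-%s\n' "$1" "$(date +%Y%m%d-%H%M%S)"
}

# Take the snapshot, then run the upgrade only if the snapshot succeeded.
snapshot_then_upgrade() {
    snap="$(snapshot_name "$ROOT_DATASET")"
    zfs snapshot "$snap" || return 1
    echo "created $snap"
    apt full-upgrade
}

# Nothing runs until you call it (as root):
#   snapshot_then_upgrade
# To undo a bad upgrade, ideally from a rescue environment:
#   zfs rollback -r rpool/ROOT/debian@pre-upgrade-<timestamp>
```

Note that zfs rollback -r destroys any snapshots taken after the one you roll back to, so check zfs list -t snapshot first.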

The migration itself is the boring part: create the pool, set sensible properties, rsync the live system across, fix up /etc/fstab and the mountpoints. The part that bit me, and bites everyone, is boot.
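For orientation, the boring part can be sketched roughly as below. Treat it as an outline under assumptions: the pool properties are common choices rather than anything this post prescribes, the disk argument is a placeholder, and it deliberately skips partitioning, where /boot lives, and the /etc/fstab cleanup. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of the migration steps. Defaults to a dry run that only prints
# the commands; set DRY_RUN=0 to execute them (as root, from a live or
# rescue environment, against a disk you are prepared to wipe).
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = target disk or partition, e.g. a /dev/disk/by-id/... path (placeholder).
migrate_root() {
    disk="$1"
    # Create the pool with an alternate root of /mnt so nothing
    # mounts over the running system.
    run zpool create -o ashift=12 \
        -O compression=lz4 -O acltype=posixacl -O xattr=sa \
        -O mountpoint=none -R /mnt rpool "$disk"
    # Container dataset plus the actual root dataset.
    run zfs create -o mountpoint=none rpool/ROOT
    run zfs create -o mountpoint=/ -o canmount=noauto rpool/ROOT/debian
    run zfs mount rpool/ROOT/debian
    # Copy the system, preserving hard links, ACLs and xattrs. -x stays on
    # the root filesystem, so /proc, /sys and any separately mounted
    # filesystems are skipped.
    run rsync -aHAXx --numeric-ids / /mnt/
}

# Example dry run:
#   migrate_root /dev/disk/by-id/YOUR-DISK-HERE
```

Reading the printed commands before setting DRY_RUN=0 is the whole point of the wrapper: zpool create on the wrong disk is not something a snapshot will save you from.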

ZFS is not in the mainline kernel, and realistically won't be, because the CDDL it ships under is widely considered incompatible with the kernel's GPL. That means the kernel has to load ZFS as an out-of-tree module, and your initramfs has to contain that module plus the userland tooling to import the pool, all before it can mount a root that lives on ZFS. A normal initramfs doesn't ship any of that. So you get a kernel that boots, drops into a BusyBox shell, and sits there unable to find its own root, which is a deeply unsettling thing to watch on a machine you've just "improved".
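You can check for this before rebooting rather than after. The sketch below reads an initramfs file listing on stdin and looks for the ZFS kernel module and the zpool binary. The function name is hypothetical and the patterns are my assumption about how those files are usually named inside the image.

```shell
#!/bin/sh
# Sketch: check whether an initramfs file listing (on stdin) contains
# the ZFS kernel module and the zpool binary.
initramfs_has_zfs() {
    listing="$(cat)"
    # The module may be compressed (zfs.ko.xz, zfs.ko.zst), hence the loose match.
    echo "$listing" | grep -Eq '/zfs\.ko(\.[a-z0-9]+)?$' || {
        echo "missing: zfs kernel module"; return 1; }
    # Matches both sbin/zpool and usr/sbin/zpool.
    echo "$listing" | grep -Eq '(^|/)sbin/zpool$' || {
        echo "missing: zpool binary"; return 1; }
    echo "ok: zfs module and zpool present"
}

# Typical use on a Debian-family system (lsinitramfs ships with initramfs-tools):
#   lsinitramfs "/boot/initrd.img-$(uname -r)" | initramfs_has_zfs
```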

The fix is to make sure the ZFS hooks are baked into the initramfs and that the bootloader passes the right root. On a Debian-family system with the zfs-initramfs package installed:

# tell the system root lives on a ZFS dataset
zpool set bootfs=rpool/ROOT/debian rpool

# rebuild the initramfs with the zfs hooks
update-initramfs -u -k all

# regenerate grub so it points root at the dataset
update-grub

Check the generated GRUB entry actually says root=ZFS=rpool/ROOT/debian and not a /dev/sd* path. If it's still pointing at a block device, GRUB will hand the kernel a root it can't mount and you're back in the BusyBox shell. The other classic snag is pool import at boot: if the pool isn't imported cleanly, set the cachefile so the initramfs knows which pool to grab without scanning every disk, then rerun update-initramfs so the refreshed cache file is copied into the image:

zpool set cachefile=/etc/zfs/zpool.cache rpool
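Both of those checks are easy to script. This is a sketch, assuming GRUB's config lives at /boot/grub/grub.cfg and that the zfs-initramfs hook copies the cache file into the image under etc/zfs/. The function names are hypothetical, and the GRUB check will complain about entries for other operating systems if you dual-boot.

```shell
#!/bin/sh
# Sketch: two post-rebuild checks. Paths and the dataset name are
# assumptions; adjust them for your system.

# Succeeds only if every kernel line in the given grub.cfg carries
# root=ZFS=<dataset>. $1 = grub.cfg path, $2 = dataset.
grub_root_is_zfs() {
    cfg="$1"; dataset="$2"
    # Kernel lines start with "linux" after optional indentation.
    lines="$(grep -E '^[[:space:]]*linux[[:space:]]' "$cfg")" || {
        echo "no kernel lines found in $cfg"; return 1; }
    bad="$(echo "$lines" | grep -Fv "root=ZFS=$dataset" || true)"
    if [ -n "$bad" ]; then
        echo "kernel lines without root=ZFS=$dataset:"
        echo "$bad"
        return 1
    fi
    echo "ok: all kernel lines use root=ZFS=$dataset"
}

# Succeeds if the initramfs listing on stdin includes the pool cache file.
initramfs_has_cachefile() {
    grep -q 'etc/zfs/zpool\.cache$'
}

# Typical use:
#   grub_root_is_zfs /boot/grub/grub.cfg rpool/ROOT/debian
#   lsinitramfs "/boot/initrd.img-$(uname -r)" | initramfs_has_cachefile
```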

Two genuine warnings. First, keep a known-good kernel and a rescue environment to hand the first time you reboot, because a kernel update that rebuilds the ZFS module against a new kernel can fail silently and leave you unbootable. DKMS usually handles it; "usually" is doing work in that sentence. Second, this couples your ability to boot to an out-of-tree module that has to be recompiled for every kernel bump, which is a real maintenance tax you're signing up for on every box you do this to.
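Since "usually" is doing work, a cheap guard is to confirm that every kernel under /lib/modules actually has a built zfs module before you reboot. A sketch, with a hypothetical function name, that assumes a directory under /lib/modules corresponds to an installed kernel:

```shell
#!/bin/sh
# Sketch: list kernels that have no built zfs module.
# $1 = modules root (defaults to /lib/modules). Prints one kernel version
# per line and returns non-zero if any are missing the module.
kernels_missing_zfs() {
    root="${1:-/lib/modules}"
    missing=0
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        # Match zfs.ko and compressed variants such as zfs.ko.xz.
        if [ -z "$(find "$dir" -name 'zfs.ko*' -print 2>/dev/null)" ]; then
            basename "$dir"
            missing=1
        fi
    done
    return "$missing"
}

# Typical use before rebooting into a new kernel:
#   kernels_missing_zfs || echo "do not reboot into the kernels listed above"
```

If a kernel shows up in the list, dkms status is the next thing to look at.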

Worth it? On a server where rollback-on-upgrade and checksumming earn their keep, yes, easily. On a laptop I'd think harder, because the boot fragility and the kernel-coupling are a faff you'll feel on every update. Either way, the first reboot is the moment of truth, and now you know which shell you'll land in when it goes wrong, and which three commands get you out.