Ramblings of an aging IT geek
homelab

rebuilding a pool one disk at a time without losing my nerve

Growing a ZFS mirror to larger drives by replacing one disk at a time and resilvering, on a box recently renamed from FreeNAS to TrueNAS.

A server rack with drive bays lit up

My storage box had run out of room, and the only honest fix was bigger disks. The pool was a pair of 4TB drives in a mirror, and I had two 8TB drives sitting in their boxes feeling smug. The plan was simple on paper: replace one disk, let it resilver, replace the other, let it resilver, then let ZFS expand into the new space. The plan is always simple on paper.

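ZFS makes this genuinely safe, which is the whole reason I run it. If you leave the old disk connected while the new one resilvers, the mirror keeps its redundancy throughout, because the old half stays live until the new half is complete. Pull the old disk first and you are running on a single drive until the resilver finishes. The command is unglamorous: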

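# Swap the old mirror member for the new disk; the resilver starts straight away
zpool replace tank gptid/old-disk-uuid gptid/new-disk-uuid
# Show resilver progress and any read, write or checksum errors
zpool status tank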
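Checking zpool status by hand gets old quickly during a long resilver, so the polling can be handed to a small shell loop. This is a minimal sketch rather than what I actually ran. The function names resilver_in_progress and wait_for_resilver are made up for illustration, the five-minute interval is arbitrary, and it assumes the scan line of zpool status contains the phrase "resilver in progress" while a rebuild is running, so check the wording on your own system first.

```shell
# Succeeds (exit 0) while the `zpool status` text on stdin reports a
# running resilver. Assumption: the scan line reads
# "scan: resilver in progress since ..." while the rebuild is active.
resilver_in_progress() {
    grep -q 'scan: resilver in progress'
}

# Poll a pool until the resilver is no longer reported as running.
#   $1 = pool name, $2 = seconds between checks (default 300)
wait_for_resilver() {
    pool="$1"
    interval="${2:-300}"
    while zpool status "$pool" | resilver_in_progress; do
        sleep "$interval"
    done
    # Print the final state so errors are not missed.
    zpool status "$pool"
}
```

Calling wait_for_resilver tank returns once the scan line stops reporting a resilver in progress. It does not judge whether the resilver succeeded, so the final status output still needs a human eye.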

A homelab shelf of disks and cables mid-rebuild

The first resilver took the best part of a day because the pool was fairly full and these are spinning disks, not flash. I watched zpool status more than was healthy, the way you watch a kettle. No errors, scan completed, one new disk in. Then the second swap, another long resilver, and the bit that catches people out: nothing got bigger automatically. ZFS will not expand a mirror until every member is the larger size, and even then only if autoexpand is on or you expand the devices by hand.

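# Let the pool grow on its own when its devices get larger
zpool set autoexpand=on tank
# Expand a new disk to its full size now; repeat for the other new disk if the pool does not grow
zpool online -e tank gptid/new-disk-uuid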
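To confirm the pool really did claim the new space, zpool list can report both the current size and any space still waiting to be expanded into. The sketch below is an illustration, not part of my original run. report_expandable is a made-up helper name, and it assumes the tab-separated output of zpool list -Hp -o name,size,expandsize, with "-" or "0" in the last column when nothing is left to claim.

```shell
# Reads tab-separated "name size expandsize" lines on stdin and says
# whether each pool still has unclaimed space.
# Usage (assumes a pool called tank):
#   zpool list -Hp -o name,size,expandsize tank | report_expandable
report_expandable() {
    while IFS="$(printf '\t')" read -r name size expand; do
        if [ "$expand" = "-" ] || [ "$expand" = "0" ]; then
            echo "$name: fully expanded at $size bytes"
        else
            echo "$name: $expand bytes still unclaimed"
        fi
    done
}
```

If it reports unclaimed space after both disks are swapped, that is the cue to check the autoexpand property and run zpool online -e against the new devices.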

And there it was: the pool quietly grew to the new size with no downtime and no panic. I did not even unmount anything.

Two things worth saying. First, this is the same week the FreeNAS folks started talking properly about bringing the open and the appliance editions together under the TrueNAS name, and it made me appreciate again that the thing underneath is just ZFS doing exactly what it promises. Second, and I cannot stress this enough: I had backups before I touched a single disk. Resilvering stresses the surviving drive harder than normal use, and the classic disaster is the second disk dying while the first one rebuilds. It did not happen to me. It happens to someone every week. Have the backup, then do the clever thing.