Replacing every disk in a pool without losing the pool

The pool was full, or close enough that I had stopped trusting it. Eight bays, six of them holding 4TB drives that were getting on for five years old, and a free-space graph that had been creeping towards the red for months. I did not want to build a new box. I wanted to grow the pool in place, which with ZFS means a particular ritual: replace every disk in a vdev with a larger one, one at a time, and the pool quietly expands to fill the new space once the last one is in.

This is the great disk shuffle, and it is one of the genuinely satisfying things about running ZFS at home. No copying everything off, no rebuild from backup, no downtime. Just patience, and a willingness to run degraded for several hours per disk while it resilvers. I did it over a weekend on TrueNAS Scale and it went almost entirely to plan, which after years of homelab work I have learned to be suspicious of.

The mechanic

The vdev was a six-wide raidz2, which means it tolerates two simultaneous disk failures. The plan was to swap each 4TB for an 8TB, one at a time. The important property of raidz2 here is the margin: while one disk is out and the array is resilvering onto its replacement, you are effectively running raidz1 for that window. You still have one disk of redundancy left. If a second drive had died mid-resilver the pool would have survived, with nothing left in reserve. On raidz1, offlining a disk for replacement leaves no redundancy at all, so a single failure during the resilver would have been the end of the pool, and that risk is exactly why I would not attempt this dance on single-parity in the first place.
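To put rough numbers on that plan, here is a back-of-envelope sketch. It assumes the usual approximation that a raidz vdev's usable space is (width minus parity) times the disk size, and it ignores metadata overhead, padding and the TB-versus-TiB gap, so real figures come out lower. The function name raidz_capacity is my own invention for illustration.

```shell
# Rough raidz capacity arithmetic (approximation: ignores ZFS overhead).
# raidz_capacity is a hypothetical helper, not a ZFS command.
# usage: raidz_capacity WIDTH PARITY DISK_TB
raidz_capacity() {
    width="$1"; parity="$2"; disk_tb="$3"
    raw=$(( width * disk_tb ))                 # every disk counted in full
    usable=$(( (width - parity) * disk_tb ))   # parity disks' worth removed
    echo "raw=${raw}TB usable=${usable}TB"
}

raidz_capacity 6 2 4   # before: six 4TB disks in raidz2
raidz_capacity 6 2 8   # after:  six 8TB disks in raidz2
```

That prints raw=24TB usable=16TB for the old layout and raw=48TB usable=32TB for the new one, so a pool that was nearly full at roughly 16TB usable should end up around half full.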

The actual steps per disk, in the TrueNAS UI, came down to:

Storage > Pool > Status
  - pick the old disk, "Offline"
  - physically pull it, insert the 8TB
  - "Replace" on the now-missing slot, choose the new disk
  - wait for resilver to complete before touching the next one
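For anyone working from a shell rather than the UI, the same per-disk sequence can be sketched roughly as below. This is a sketch under assumptions, not what TrueNAS runs under the hood: the device names passed in are placeholders, the swap_disk and run helpers and the DRY_RUN switch are hypothetical, and zpool wait needs OpenZFS 2.0 or later. On TrueNAS the UI is the safer route, since it also takes care of preparing the new disk.

```shell
# Sketch of one disk swap from the command line.
# swap_disk, run and DRY_RUN are hypothetical helpers, not part of ZFS.

DRY_RUN="${DRY_RUN:-1}"   # default: print the commands instead of running them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: swap_disk POOL OLD_DEVICE NEW_DEVICE
swap_disk() {
    pool="$1"; old="$2"; new="$3"

    # Take the old disk out of service before pulling it.
    run zpool offline "$pool" "$old"

    # Pause for the physical swap when running for real.
    if [ "$DRY_RUN" != "1" ]; then
        printf 'Swap the drive, then press Enter: '
        read -r answer
    fi

    # Start the resilver onto the new disk.
    run zpool replace "$pool" "$old" "$new"

    # Block until the resilver has finished before returning.
    run zpool wait -t resilver "$pool"
}
```

Calling swap_disk tank sdc sdh with DRY_RUN left at 1 only prints the three zpool commands, which is a cheap way to check the device names against zpool status before committing to anything.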

The cardinal rule, in bold in my own notes: wait for the resilver to finish before pulling the next drive. Each resilver dropped me to single redundancy for its duration. Pulling a second disk before the first replacement had fully resilvered would have left two disks' worth of data missing at once, which on raidz2 means no parity left and is the edge of the cliff. So I went one at a time, no heroics, no parallelism.

The waiting game

Resilvering is where the patience comes in. Each 4TB-to-8TB replacement took six to eight hours, because resilver time scales with how much data is on the vdev, not with the size of the new disk, and the pool was full, which is the whole reason I was doing this. Six disks, six resilvers, mostly overnight. You can keep using the pool while it resilvers; TrueNAS does not stop you. I throttled my own usage anyway, because a busy pool resilvers more slowly and I wanted each window to be as short as possible.
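As a sanity check on those figures, the arithmetic is just the data to rewrite divided by sustained throughput. The numbers below are illustrative assumptions, not measurements from my pool: roughly 3,600GB to rebuild onto each disk, and a sustained rate somewhere between 125 and 150MB/s. The resilver_hours helper is hypothetical.

```shell
# Back-of-envelope resilver duration: data per disk / sustained throughput.
# resilver_hours is a hypothetical helper; the inputs are assumed, not measured.
# usage: resilver_hours DATA_GB THROUGHPUT_MB_PER_S
resilver_hours() {
    awk -v gb="$1" -v mbps="$2" \
        'BEGIN { printf "%.1f\n", gb * 1000 / mbps / 3600 }'
}

resilver_hours 3600 125   # slower end of the assumed range
resilver_hours 3600 150   # faster end of the assumed range
```

With those assumed inputs the two calls print 8.0 and 6.7 hours, which brackets what I saw. Doubling the disk size changes nothing here; only the amount of data and the throughput matter.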

You watch it with:

zpool status -v tank

and you get a percentage and an estimate that lies to you early and settles down later. The estimate for the first disk swung between three hours and eleven before it calmed. Do not trust the first half hour of any resilver estimate. Go and do something else.
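If you would rather log the progress than stare at it, the percentage can be scraped out of that output. A minimal sketch, assuming the status text contains a line of the form "... resilvered, NN.NN% done, ...", which is what I have seen but which can vary between OpenZFS versions. The resilver_percent helper is hypothetical, and it will also match a scrub's progress line.

```shell
# Pull the "% done" figure out of `zpool status` output read from stdin.
# Prints nothing when no progress line is present (for example, once the
# resilver has finished). resilver_percent is a hypothetical helper.
resilver_percent() {
    sed -n 's/.* \([0-9.][0-9.]*\)% done.*/\1/p'
}

# usage: poll once a minute until the progress line disappears
#   while pct=$(zpool status tank | resilver_percent) && [ -n "$pct" ]; do
#       echo "resilver at ${pct}%"
#       sleep 60
#   done
```

Logging the percentage rather than the time-to-go estimate is deliberate: the percentage only moves one way.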

The two things that nearly bit me

First, SMR. The drives I was about to buy needed checking, because shingled drives are catastrophically slow at the random write pattern a resilver produces, and a couple of the cheaper 8TB models on the market use SMR without shouting about it. I checked every model number against the manufacturer's specifications before buying, and bought CMR drives only. Had I not, the resilvers might have taken days each instead of hours. SMR in a ZFS pool is a known way to make yourself miserable.

Second, the autoexpand flag. After the last disk finished resilvering I waited for the pool to grow and nothing happened. The capacity stayed exactly where it had been. The pool only expands to use the new space once every disk in the vdev is larger, which was now true, but only if autoexpand is on, and it defaults to off. One command fixed it:

zpool set autoexpand=on tank

and the free space appeared, all of it at once, the pool jumping from nearly-full to comfortably half-empty. I had a brief moment of believing I had done six resilvers for nothing, which was its own kind of education in reading the documentation before, rather than after.
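Checking the property before the first swap would have saved me the scare. Here is a sketch of the commands I would line up next time, printed rather than executed so they can be reviewed first. The expand_cmds helper and the device names are hypothetical; use whatever names zpool status lists for your pool. The zpool online -e step asks a device to expand into the space now available, and I include it as a fallback in case flipping the property after the fact does not grow the pool on its own.

```shell
# Print the commands for checking and triggering pool expansion.
# expand_cmds is a hypothetical helper; it only echoes, it runs nothing.
# usage: expand_cmds POOL DEVICE...
expand_cmds() {
    pool="$1"; shift
    echo "zpool get autoexpand $pool"        # see the current setting
    echo "zpool set autoexpand=on $pool"     # allow growth into new space
    for dev in "$@"; do
        echo "zpool online -e $pool $dev"    # fallback: expand each device
    done
    echo "zpool list -v $pool"               # confirm SIZE and EXPANDSZ
}
```

Running expand_cmds tank followed by the six device names gives a short script to read through, and piping it to sh runs it once you are happy with it.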

Was it worth it

Completely. The pool went from 24TB raw to 48TB raw, the data never left, there was no downtime that mattered, and the old 4TB drives are now a cold-spare pile and an offsite backup set. The whole thing cost a weekend of mostly-waiting and the price of six drives. That is the quiet pleasure of ZFS: the boring, methodical path actually works, the redundancy is real, and if you respect the resilver windows and check your drives are CMR, the great disk shuffle is about as stressful as watching a progress bar. Which, gloriously, is exactly what it is.