the weekend btrfs snapshots earned their keep

I have been quietly sceptical of btrfs for years. Not hostile, just cautious, in the way you're cautious about a tool that has eaten people's data and still ships with a wiki page full of caveats. This weekend it paid me back for that caution by saving the better part of two days.

Here's what happened, and why I'm now running pre-upgrade snapshots on every machine I care about.

the setup

The workstation runs Arch, root on btrfs, with a fairly conventional subvolume layout: @ for root, @home for home, @snapshots for the snapshots themselves. Nothing exotic. The reason it's btrfs at all is that I wanted cheap, fast snapshots, and the reason I wanted those is precisely the situation I'm about to describe.

The one piece of automation that mattered is a pacman hook. Before any transaction, take a read-only snapshot of root:

[Trigger]
Operation = Upgrade
Operation = Install
Operation = Remove
Type = Package
Target = *

[Action]
Description = Snapshotting root before pacman transaction
When = PreTransaction
Exec = /bin/sh -c '/usr/bin/btrfs subvolume snapshot -r / /.snapshots/pre-$(date +%Y%m%d-%H%M%S)'

That hook has fired hundreds of times and I had never once needed its output. You can see where this is going.

the breakage

Friday evening, I ran a routine pacman -Syu. Big update, a new kernel, a glibc bump, the usual churn of a system that hadn't been touched in a fortnight. It completed without error. I rebooted.

It did not come back.

What I got instead was an early boot failure, dropped to an emergency shell, with the initramfs unable to mount root cleanly. I'll spare you the full diagnosis because the diagnosis isn't the point. The point is that at eight on a Friday night, staring at a dracut emergency prompt, I had a choice. I could spend the evening bisecting which of forty-odd upgraded packages had done this, with a sick machine and no working browser to search from. Or I could roll back.

the rollback

From the emergency shell I mounted the top-level btrfs volume so I could see all the subvolumes:

mount -o subvolid=5 /dev/sda2 /mnt
ls /mnt/@snapshots/

There was my snapshot, pre-20160617-194212, timestamped a few minutes before the upgrade that had broken everything. (Seen from the top level, the snapshots sit under @snapshots, the subvolume the running system mounts at /.snapshots.) The recovery is conceptually simple: the broken @ subvolume needs to step aside, and a writable copy of the good snapshot needs to take its place.

mv /mnt/@ /mnt/@broken-20160617
btrfs subvolume snapshot /mnt/@snapshots/pre-20160617-194212 /mnt/@

Reboot. The machine came straight back up on the pre-upgrade state. Total time from "oh no" to a working desktop was about five minutes, and most of that was me double-checking the subvolume names because I did not want to delete the wrong thing in a panic.
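For anyone who shares my fear of typing the wrong path in a panic, the two rollback commands can be sketched as a small function that checks the paths first and only prints what it would run. This is a sketch, not something I had on the night: the function name rollback_plan and its three arguments are my own invention, and it assumes the root subvolume is called @ and that the top-level volume is already mounted.

```shell
#!/bin/sh
# Sketch only: validate the paths, then PRINT the rollback commands for review.
# Nothing here modifies the filesystem.
# Assumptions: root subvolume is named "@"; TOP is where the top-level
# volume (subvolid=5) is mounted; SNAP is the full path to the snapshot.
rollback_plan() {
    top=$1      # e.g. /mnt
    snap=$2     # full path to the read-only pre-upgrade snapshot
    stamp=$3    # suffix for the set-aside root, e.g. 20160617

    if [ -z "$top" ] || [ -z "$snap" ] || [ -z "$stamp" ]; then
        echo "usage: rollback_plan TOP SNAPSHOT_PATH STAMP" >&2
        return 1
    fi
    if [ ! -d "$top/@" ]; then
        echo "no root subvolume at $top/@" >&2
        return 1
    fi
    if [ ! -d "$snap" ]; then
        echo "no snapshot at $snap" >&2
        return 1
    fi
    if [ -e "$top/@broken-$stamp" ]; then
        echo "$top/@broken-$stamp already exists" >&2
        return 1
    fi

    # Step 1: rename the broken root out of the way (no data is copied).
    printf 'mv %s/@ %s/@broken-%s\n' "$top" "$top" "$stamp"
    # Step 2: take a writable snapshot of the good state as the new root.
    printf 'btrfs subvolume snapshot %s %s/@\n' "$snap" "$top"
}
```

Printing rather than executing is deliberate: in an emergency shell I would rather read the two lines back and paste them than trust a script I wrote on a calmer day.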

why this beat the alternatives

I've recovered broken systems before, and I know the usual options. A full reinstall is the nuclear one: hours of work and you lose anything not in your dotfiles repo. Booting a live USB to chroot and downgrade individual packages is faster, but you still have to work out which packages, and you need the right cached .pkg.tar.xz files, which you may have just cleaned out.

The snapshot sidesteps all of that. It isn't clever, it's just a known-good point in time that took a moment to create and costs almost nothing to keep around, because btrfs is copy-on-write and a snapshot only consumes extra space for blocks that change after it was taken. The whole value proposition is that the cost is paid in advance, automatically, and you forget it's even happening until the one evening it matters.

A word on what snapshots are not. This is not a backup. The snapshot lived on the same disk as the failure, and if that disk had died, or if I'd hit filesystem-level corruption rather than a package-level breakage, it would have been useless. Snapshots protect you from your own changes and from bad upgrades. They do not protect you from hardware failure. I still run borg to a separate machine nightly, and I'd encourage you to do the same. Snapshots and backups solve different problems and you want both.

the cleanup

Once I was confident the rolled-back system was healthy, I kept the broken subvolume around for a bit so I could investigate at leisure. With the machine working and a browser available, the actual cause took ten minutes to find: a new kernel and an out-of-tree module that hadn't been rebuilt to match it. I sorted it properly the following morning, ran the upgrade again, and this time it booted fine. Then, with the top-level volume mounted at /mnt again, I deleted the evidence:

btrfs subvolume delete /mnt/@broken-20160617

I also added a small retention job so I don't accumulate hundreds of pre-transaction snapshots forever; keeping the last twenty or so is plenty, and btrfs subvolume list plus a bit of sorting handles the pruning.
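The pruning logic is small enough to sketch. This is not my exact job: the function name snapshots_to_prune is made up for illustration, and for brevity it takes names from a plain directory listing rather than parsing btrfs subvolume list. It leans on the pre-YYYYMMDD-HHMMSS naming from the hook, which means a plain lexical sort is also a chronological one.

```shell
#!/bin/sh
# Sketch only. Reads snapshot names on stdin, one per line, and prints the
# ones that fall outside the newest KEEP. Names that do not start with
# "pre-" are ignored, so unrelated subvolumes are never listed for deletion.
snapshots_to_prune() {
    keep=$1
    grep '^pre-' | sort | awk -v keep="$keep" '
        { names[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print names[i] }
    '
}

# Hypothetical usage, assuming the snapshots are visible under /.snapshots:
#   ls /.snapshots | snapshots_to_prune 20 | while read -r name; do
#       btrfs subvolume delete "/.snapshots/$name"
#   done
```

Keeping the selection separate from the deletion means the list can be eyeballed before anything is removed, which seems a sensible default for a job that deletes things unattended.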

the takeaway

If you run a rolling-release distro, or frankly any system where an upgrade can leave you unbootable, automate a pre-upgrade snapshot. It does not have to be btrfs; ZFS does the same trick, and on LVM you can fake something similar. The filesystem matters less than the habit.
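To make the comparison concrete, here is roughly what the pre-upgrade snapshot command looks like on each of the three. Treat it as a sketch: the function name snapshot_cmd is invented, the ZFS dataset rpool/ROOT/default and the LVM volume vg0/root are placeholders for whatever your system actually uses, and the 5G snapshot size is an arbitrary example (an LVM snapshot needs free space in the volume group, and fills up as the origin changes).

```shell
#!/bin/sh
# Sketch only: print the pre-upgrade snapshot command for a given backend.
# The dataset and volume names below are placeholders, not defaults.
snapshot_cmd() {
    backend=$1
    name=$2     # e.g. pre-20160617-194212
    case $backend in
        btrfs)
            # Read-only snapshot of the running root.
            printf 'btrfs subvolume snapshot -r / /.snapshots/%s\n' "$name"
            ;;
        zfs)
            # ZFS snapshots are always read-only and named dataset@name.
            printf 'zfs snapshot rpool/ROOT/default@%s\n' "$name"
            ;;
        lvm)
            # Classic LVM snapshot with a fixed-size copy-on-write area.
            printf 'lvcreate --snapshot --size 5G --name %s vg0/root\n' "$name"
            ;;
        *)
            echo "unknown backend: $backend" >&2
            return 1
            ;;
    esac
}
```

The rollback procedure differs far more between the three than the snapshot step does, so it is worth rehearsing whichever one applies to you before the evening you need it.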

I went into this weekend mildly distrustful of btrfs and I came out of it grateful. That's not a glowing endorsement, but for the one job I asked it to do, it did the job flawlessly and quickly, on a Friday night, when I least wanted a fight. That counts for a lot.