Ramblings of an aging IT geek
homelab

the backups i actually restore from now

After a near miss with a corrupt array, I stopped trusting backups I had never restored and started running monthly test restores.

For years I had backups in the way most people have a smoke alarm with a flat battery. They existed. They were green in the dashboard. I had never once restored from them, which means I did not have backups, I had a feeling.

The wake-up call was a degraded array and a tense afternoon spent reading restic documentation I should have read while calm. The data came back, but only after I discovered one of my repos had been silently failing its prune for weeks because the cron job's environment was missing a password file. Green dashboard, broken backup. Classic.
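The missing-password-file failure is the kind of thing a small preflight check can catch before restic ever runs. Here is a minimal sketch in Python, assuming the job is configured through restic's `RESTIC_REPOSITORY` and `RESTIC_PASSWORD_FILE` environment variables (a setup that passes `--password-file` or uses `RESTIC_PASSWORD_COMMAND` would need a different check). The function name `preflight` is made up for illustration.

```python
import os
from collections.abc import Mapping
from pathlib import Path


def preflight(env: Mapping[str, str]) -> list[str]:
    """Return a list of problems with the restic environment (empty means OK)."""
    problems = []

    if not env.get("RESTIC_REPOSITORY"):
        problems.append("RESTIC_REPOSITORY is not set")

    password_file = env.get("RESTIC_PASSWORD_FILE")
    if not password_file:
        problems.append("RESTIC_PASSWORD_FILE is not set")
    else:
        path = Path(password_file)
        # Under cron the file may exist but be unreadable for the job's user.
        if not (path.is_file() and os.access(path, os.R_OK)):
            problems.append(f"password file is missing or unreadable: {path}")

    return problems
```

Calling `preflight(os.environ)` at the top of the cron job and exiting non-zero on any problem turns a silent misconfiguration into a visible failure.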

So now the rule is boring and absolute: a backup I have not restored is not a backup. Once a month I pick a random snapshot, pull it into a scratch directory, diff a handful of files, and run restic check --read-data-subset on a slice of the pack files. It takes ten minutes and it has caught two problems since I started.
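A minimal sketch of what such a monthly test could look like in Python. It is an illustration under stated assumptions, not the exact script behind this post. It assumes restic is on the PATH with its repository and password configured through the environment, that snapshots were taken of absolute paths (so a restore recreates the full path under the target directory), and that the sample files rarely change, since they are compared against the live copies. The helper names and the 5% subset are made up for the example.

```python
import filecmp
import json
import random
import subprocess
import tempfile
from pathlib import Path


def restic(*args: str) -> str:
    """Run restic and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(
        ["restic", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def pick_snapshot(snapshots_json: str, rng: random.Random) -> str:
    """Pick a random snapshot ID from the output of `restic snapshots --json`."""
    snapshots = json.loads(snapshots_json)
    if not snapshots:
        raise RuntimeError("repository has no snapshots")
    return rng.choice(snapshots)["id"]


def restored_path(target: Path, original: Path) -> Path:
    """Map an absolute source path to where it lands under the restore target."""
    return target / original.relative_to(original.anchor)


def mismatches(target: Path, samples: list[Path]) -> list[Path]:
    """Return the sample files that are missing from, or differ in, the restore."""
    bad = []
    for original in samples:
        copy = restored_path(target, original)
        # shallow=False forces a byte-for-byte comparison, not just a stat check.
        if not copy.is_file() or not filecmp.cmp(original, copy, shallow=False):
            bad.append(original)
    return bad


def monthly_test(samples: list[Path], subset: str = "5%") -> list[Path]:
    """Restore a random snapshot to a scratch dir, diff samples, verify pack data."""
    snapshot = pick_snapshot(restic("snapshots", "--json"), random.Random())
    with tempfile.TemporaryDirectory(prefix="restore-test-") as scratch:
        restic("restore", snapshot, "--target", scratch)
        bad = mismatches(Path(scratch), samples)
    # Re-read a slice of the pack files to verify the stored data itself.
    restic("check", f"--read-data-subset={subset}")
    return bad
```

Calling `monthly_test([Path("/etc/hostname")])` returns an empty list when the sampled files match. Any restic failure raises an exception, so a cron wrapper can treat both a non-empty list and an exception as a failed test.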

The other change was making the test loud. A successful restore pings a healthcheck; a missed run sends me a nudge. The failure mode I care about is silence: the job that quietly stopped running and told no one. Backups are easy. Restores are the actual product, and you only find out which kind of backup you have on the worst day.
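The loud part can be sketched as a dead man's switch: the job pings a monitoring URL when it finishes, and the monitoring service raises the alarm when the ping does not arrive on schedule. The sketch below makes some assumptions: the URL is a placeholder, the `/fail` suffix follows the convention used by services such as Healthchecks.io and may not match your monitor, and the function names are made up for illustration.

```python
import urllib.request


def ping_url(base_url: str, ok: bool) -> str:
    """Build the URL to hit: the base URL on success, base + /fail on failure."""
    return base_url if ok else base_url.rstrip("/") + "/fail"


def report(base_url: str, ok: bool, timeout: float = 10.0,
           opener=urllib.request.urlopen) -> bool:
    """Send the ping; return True if it was delivered, False otherwise.

    A failed ping is deliberately not raised: if the monitor never hears
    from the job, the missed check-in is what triggers the nudge.
    """
    try:
        with opener(ping_url(base_url, ok), timeout=timeout):
            return True
    except OSError:  # covers URLError, HTTPError, and socket timeouts
        return False
```

A wrapper would call something like `report("https://hc.example/ping/your-check-id", ok=True)` after a clean restore test. The alert on a missed run has to live on the monitoring side, because a job that never starts cannot report its own absence.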