Ramblings of an aging IT geek
homelab

the backup you have not restored is a rumour

A monthly restore drill that turned my homelab backups from a comforting belief into something I can actually count on, and the one cron job that makes it stick.


For years my backups were a rumour. Restic ran nightly, the repository grew, the dashboard was green, and I had never once pulled a real file out of it under pressure. I knew, in the abstract, that an untested backup is just a directory you are paying to store. I just kept not testing it, the way you keep not going to the dentist.

What fixed it was making the test cheap and automatic instead of heroic and occasional. The mistake I had been making was imagining the test as a full disaster recovery: rebuild the host, restore everything, prove the whole thing end to end. That is so much effort that it never happened. So now the monthly job is smaller and dumber. A cron job restores a couple of known directories to a scratch directory and checksums a handful of files in them against a manifest I keep separately. If the checksums match, I get nothing. If they do not, I get a loud email.

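# Empty the scratch dir first so stale files from an earlier run cannot pass.
# Relative manifest paths (e.g. etc/hostname) resolve against the restore root.
# ALERT_EMAIL is a placeholder for your own address; the job aborts if it is unset.
rm -rf /tmp/restore-test
{ restic --quiet restore latest --target /tmp/restore-test \
    --include /etc --include /srv/important &&
  (cd /tmp/restore-test && sha256sum -c --quiet /opt/backup-manifest.sha256); } || \
  mail -s "RESTORE TEST FAILED" "${ALERT_EMAIL:?set ALERT_EMAIL}" < /dev/null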
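
The check above depends on a manifest, and the post does not show how it is built. One plausible way, offered as a sketch and not as the author's actual method, is to hash a few rarely changing files with paths relative to a root directory, so the same manifest can later be verified from inside the scratch restore directory. The function name `build_manifest` and the example file names are hypothetical.

```shell
# build_manifest ROOT FILE...
# Prints sha256sum lines for each FILE, with paths kept relative to ROOT,
# so the manifest verifies from any directory that mirrors ROOT's layout.
build_manifest() {
  local root=$1
  shift
  (cd "$root" && sha256sum -- "$@")
}

# Hypothetical usage on the live host (file names are illustrative only):
#   build_manifest / etc/hostname etc/fstab > /opt/backup-manifest.sha256
```

The files listed need to be ones that do not change between building the manifest and taking the snapshot, otherwise the drill will raise false alarms. Regenerating the manifest whenever one of them is edited on purpose keeps it honest.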

That is the whole trick. It is not a real disaster drill and it does not pretend to be. But it exercises the actual path: the repository unlocks, restic reads the snapshot, the data comes back byte-for-byte. The first time it ran it found a lock I had left on the repository and forgotten months earlier, which would have been a wonderful thing to discover during an actual emergency rather than on a quiet Monday.
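
Since the drill walks that path in order, it can help if the alert says which step broke. The following is a minimal sketch under stated assumptions: `run_stage` and `restore_drill` are hypothetical helper names, the include paths are copied from the snippet above, and the manifest is assumed to be given as an absolute path with entries relative to the restore root.

```shell
# run_stage NAME CMD...
# Runs CMD with its output discarded; on failure prints which stage failed.
run_stage() {
  local name=$1
  shift
  if ! "$@" >/dev/null 2>&1; then
    echo "restore test failed at stage: $name"
    return 1
  fi
}

# restore_drill SCRATCH MANIFEST
# Restores into an empty SCRATCH dir, then verifies MANIFEST from inside it.
# Prints nothing and returns 0 when both stages succeed.
restore_drill() {
  local scratch=$1 manifest=$2
  # Start clean so leftovers from an earlier run cannot pass the check.
  rm -rf -- "${scratch:?}"
  run_stage "restore" restic restore latest --target "$scratch" \
    --include /etc --include /srv/important || return 1
  run_stage "checksum" sh -c 'cd "$1" && sha256sum -c --quiet "$2"' \
    sh "$scratch" "$manifest" || return 1
}
```

A stale lock or an unreadable snapshot would show up as a failure at the `restore` stage, and corrupted or missing data at the `checksum` stage. If this lived in a script at, say, `/usr/local/bin/restore-drill.sh` (a hypothetical path), a crontab entry of `0 6 1 * * /usr/local/bin/restore-drill.sh` would run it at 06:00 on the first day of each month.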

Twice a year I still do the big one: stand up a fresh VM, restore properly, boot it. But the monthly checksum job is what turned my backups from something I believed in into something I have evidence for. The difference between those two is the entire job.