I had a job that had been nagging me for months. Our backup box keeps dated snapshot directories, one per day, and nobody had ever written the thing that prunes them. Keep the last fourteen dailies, one a week beyond that, one a month beyond that. The classic grandfather-father-son retention you've all written badly at least once. There was a shell script. It was wrong in a way that only showed up when a month boundary landed on a Sunday, and by the time I'd stared at the date arithmetic for twenty minutes I'd decided I was going to write it in Rust instead. Partly because the bug was a parsing-and-dates bug and I wanted real types. Partly, if I'm honest, because I'd been looking for an excuse.
So: was it worth it? Mostly yes, and I'll show you the bits that made me say so and the bits that made me sigh.
the actual problem
The hard part of retention isn't deleting files, it's deciding which ones to keep. The directories look like snap-2017-06-30, one per day going back about a year. I want to walk them, parse the date out of each name, bucket them by the retention rules, and then everything not in a bucket gets removed. The shell version did this with ls, cut, sort and a heap of hope. The dates were strings the whole way through, so "is this the first snapshot of its week" was a question I was answering with awk and prayer.
In Rust the first thing I did was stop having strings. Parse once, at the edge, into a real date, and then the rest of the program reasons about dates.
use chrono::NaiveDate;

// Parse a directory name like snap-2017-06-30; None for anything else.
fn parse_snapshot(name: &str) -> Option<(NaiveDate, String)> {
    let date_part = name.strip_prefix("snap-")?;
    let date = NaiveDate::parse_from_str(date_part, "%Y-%m-%d").ok()?;
    Some((date, name.to_string()))
}
chrono does the date work, and NaiveDate gives me iso_week() and month() and ordering for free, which is precisely the arithmetic the shell script kept getting wrong. The Option return means a directory that doesn't match the pattern (someone's stray lost+found, a half-finished snapshot) just falls out of the iterator rather than blowing up. That alone would have caught the original bug.
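To make the "falls out of the iterator" part concrete, here is a minimal sketch of the filtering step. To keep it free of dependencies it uses a hypothetical stand-in parser, parse_snapshot_name, that returns a (year, month, day) tuple instead of a NaiveDate, and a hypothetical helper, dated_snapshots; neither is the tool's real code. The stand-in only checks the shape of the name, which is exactly the job chrono does properly.

```rust
/// Stand-in for the chrono-based parser: returns (year, month, day) for
/// names shaped like snap-YYYY-MM-DD and None for anything else.
/// Assumption: this checks shape and rough ranges only, not real calendar
/// validity (it would accept 2017-02-31); NaiveDate handles that properly.
fn parse_snapshot_name(name: &str) -> Option<(u16, u8, u8)> {
    let rest = name.strip_prefix("snap-")?;
    let mut parts = rest.split('-');
    let year: u16 = parts.next()?.parse().ok()?;
    let month: u8 = parts.next()?.parse().ok()?;
    let day: u8 = parts.next()?.parse().ok()?;
    if parts.next().is_some() || !(1..=12).contains(&month) || !(1..=31).contains(&day) {
        return None;
    }
    Some((year, month, day))
}

/// Keep only the names that parse, sorted newest first.
/// Anything that returns None is silently dropped by filter_map.
fn dated_snapshots<'a>(names: &[&'a str]) -> Vec<((u16, u8, u8), &'a str)> {
    let mut snaps: Vec<_> = names
        .iter()
        .filter_map(|&name| parse_snapshot_name(name).map(|date| (date, name)))
        .collect();
    // Tuples compare field by field, so (year, month, day) orders like a date.
    snaps.sort_by(|a, b| b.0.cmp(&a.0));
    snaps
}
```

Fed a listing that contains lost+found or a snap-2017-06-30.partial, dated_snapshots returns only the well-formed entries, and nothing downstream ever sees a name it cannot date.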
where the types paid off
The bucketing logic is the heart of it, and this is where having real values rather than strings stopped being a nicety and started being the point. I sort the snapshots newest first, then walk them deciding what to keep:
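use chrono::Datelike; // brings iso_week(), year() and month() into scope
use std::collections::HashSet;

let mut keep = HashSet::new();
let mut seen_weeks = HashSet::new();
let mut seen_months = HashSet::new();
for (i, (date, name)) in snaps.iter().enumerate() {
    let week = date.iso_week();
    if i < 14 {
        // The fourteen newest snapshots are always kept.
        keep.insert(name.clone());
    } else if seen_weeks.insert((week.year(), week.week())) {
        // Newest snapshot of an ISO week; including the year keeps
        // week 26 of 2016 distinct from week 26 of 2017.
        keep.insert(name.clone());
    } else if seen_months.insert((date.year(), date.month())) {
        // Newest snapshot of a calendar month not already covered.
        keep.insert(name.clone());
    }
}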
HashSet::insert returns whether the value was new, so "is this the first snapshot I've seen for this week" becomes one honest expression instead of a tangle of remembered state. I'm not claiming this is clever. I'm claiming it's the kind of thing that's easy to write correctly when iso_week() is a method on a date and not a substring you carved out yourself.
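A small sketch of that insert-returns-bool trick on its own, using only the standard library. The helper name first_sightings is hypothetical, and the keys are plain numbers standing in for whatever the date type hands back. It also shows why the key matters once the history spans more than one year: a bare week number treats week 26 of two different years as the same week, while a (year, week) pair does not.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// For each key, in order, report whether this is the first time it appears.
/// HashSet::insert returns true only when the value was not already present.
fn first_sightings<K: Hash + Eq + Copy>(keys: &[K]) -> Vec<bool> {
    let mut seen = HashSet::new();
    keys.iter().map(|&k| seen.insert(k)).collect()
}
```

With bare week numbers, first_sightings(&[26, 26, 26]) marks only the first entry as new. With (year, week) pairs such as (2017, 26), (2017, 26), (2016, 26), the third entry is new again, which is what a pruner walking back past a year boundary wants.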
The compiler also made me handle the empty directory and the unreadable directory and the entry that's a file not a folder. In the shell version every one of those was an unconsidered crash waiting for a bad day.
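Roughly what that looks like, as a hedged sketch rather than the tool's actual code: a hypothetical list_subdirs helper built on std::fs::read_dir. Every fallible step returns a Result, so each awkward case has to be either propagated with ? or skipped on purpose.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// List the names of the subdirectories directly under `root`.
/// Plain files, symlinks and names that are not valid UTF-8 are skipped;
/// an unreadable or missing root comes back as Err instead of a crash.
fn list_subdirs(root: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(root)? {
        let entry = entry?; // a single entry can fail to read, too
        if !entry.file_type()?.is_dir() {
            continue; // a file, not a snapshot folder
        }
        if let Ok(name) = entry.file_name().into_string() {
            names.push(name);
        }
    }
    names.sort();
    Ok(names)
}
```

An empty directory simply yields an empty Vec, so the caller's loop runs zero times and nothing gets deleted by accident.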
where it fought me
Two things, same as every time I do this, so at least it's consistent.
The first was dependencies and build time. A do-nothing binary that depends on chrono is not a fast compile, and the edit-compile-run loop on a tool you're testing against a real directory tree drags. I ended up with a tiny fixture tree of empty dated folders and a --dry-run flag, so that every build I did wait for could be pointed at something small and harmless instead of the real backup tree. The dry-run flag was the right call regardless: a pruning tool you can't run safely in print-only mode is a tool nobody will trust enough to actually run.
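A minimal sketch of a print-only mode, assuming the flag is spelled --dry-run as above. The function names is_dry_run and prune and the output wording are hypothetical, not lifted from the real tool.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// True if the --dry-run flag appears anywhere among the arguments.
fn is_dry_run<I: IntoIterator<Item = String>>(args: I) -> bool {
    args.into_iter().any(|arg| arg == "--dry-run")
}

/// Remove each doomed directory, or only report it when `dry_run` is set.
/// Returns the paths that were (or would have been) removed.
fn prune(doomed: &[PathBuf], dry_run: bool) -> io::Result<Vec<PathBuf>> {
    let mut handled = Vec::new();
    for path in doomed {
        if dry_run {
            println!("would remove {}", path.display());
        } else {
            // Stop at the first delete that fails rather than carrying on.
            fs::remove_dir_all(path)?;
            println!("removed {}", path.display());
        }
        handled.push(path.clone());
    }
    Ok(handled)
}
```

In a main function this would be called as prune(&doomed, is_dry_run(std::env::args())). Both modes walk the same list and share the same code path up to the delete itself, so what the dry run prints is what the real run removes.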
The second was that I over-engineered the error handling for about an hour before catching myself. This is a tool that walks one directory and deletes some folders. It does not need a seven-variant error enum with From impls for the world. I deleted all of it and used a single boxed error and ?, and the program got shorter and no worse. Rust makes thorough error handling so easy to reach for that you can forget to ask whether this particular small program has earned the thoroughness. It hadn't.
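For anyone who hasn't seen the single-boxed-error pattern, a short sketch. The function excess_entries is a made-up example, not part of the pruner; the point is only that two unrelated error types pass through ? into one Box<dyn Error> without any enum or From impls written by hand.

```rust
use std::error::Error;
use std::fs;
use std::path::Path;

/// Count the entries under `root` beyond the `keep` limit (given as text,
/// the way it would arrive from a command line).
fn excess_entries(root: &Path, keep: &str) -> Result<usize, Box<dyn Error>> {
    // ParseIntError converts into Box<dyn Error> automatically.
    let keep: usize = keep.trim().parse()?;
    let mut count: usize = 0;
    // io::Error converts the same way, both here and per entry.
    for entry in fs::read_dir(root)? {
        entry?;
        count += 1;
    }
    Ok(count.saturating_sub(keep))
}
```

The trade-off is that the caller can no longer match on specific failure kinds without downcasting, which is fine when all the program will ever do with an error is print it and exit non-zero.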
the accounting
The shell script was about 40 lines and wrong. The Rust version is around 180 lines and, as far as I can tell after a fortnight of watching it, right. That's a big multiplier on the line count, and I want to be fair about it: a chunk of those extra lines are the dry-run mode, the pattern that skips junk directories, and actually checking that a delete succeeded. None of which the shell version did. So it's not 40 lines of logic becoming 180 lines of the same logic. It's 40 lines of optimism becoming 180 lines that handle the cases that were quietly going to bite me.
Was it worth it for a backup pruner that runs once a day on one machine? For the correctness, yes, easily, because the failure mode of the old one was deleting the wrong snapshots, and that's the one job a backup tool must not get wrong. For the deployment story, also yes: it's a single static binary on a box that has no business growing a Python or Ruby runtime just to do date maths.
Would I do it again for something genuinely trivial, a five-line glue script? No. Bash still wins that and it isn't close. But the moment a script's bug is a logic-and-types bug rather than a typo, the moment getting it wrong is expensive, Rust keeps earning its afternoon. The borrow checker, for the record, barely came up. This was all about having real dates instead of strings, and that's a much more boring and much more useful endorsement.