I keep writing the same kind of small Rust tool, and I keep making the same mistake in the first draft, so I may as well write it down where it can shame me later. This week's excuse was a directory full of rotated logs that nobody could eyeball, so I wanted a thing that walks a path and prints the total bytes per file extension, biggest first. Twenty minutes of work. It was, of course, not twenty minutes.
The mistake is always the same: I reach for String where I should reach for a real type, and I leave the output formatting until last when it wants to be a type from the start. I started accumulating sizes into a HashMap<String, u64> keyed on the extension, which is fine, and then started formatting bytes into "4.2 MB" with a free function called from three places. By the third call site I had a humanise function taking a u64 and returning a String, and a sort that compared the raw bytes but printed the formatted string, and the usual quiet anxiety that those two had drifted apart.
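For reference, the accumulating half (the part that was fine) can be sketched roughly like this. It is a minimal sketch under assumptions, not the actual tool: ext_key, accumulate and the "(none)" bucket for files without an extension are names invented here for illustration, and it uses only the standard library.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: the map key for a path.
/// Files with no extension go into a "(none)" bucket (an assumption).
/// Note that a rotated log such as app.log.1 has the extension "1".
fn ext_key(path: &Path) -> String {
    path.extension()
        .map(|e| e.to_string_lossy().into_owned())
        .unwrap_or_else(|| "(none)".to_string())
}

/// Hypothetical helper: recursively add the size of every regular file
/// under `dir` to `totals`. Symlinks are skipped, because
/// `DirEntry::file_type` does not follow them.
fn accumulate(dir: &Path, totals: &mut HashMap<String, u64>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let file_type = entry.file_type()?;
        let path = entry.path();
        if file_type.is_dir() {
            accumulate(&path, totals)?;
        } else if file_type.is_file() {
            // Raw bytes only; nothing here knows how to print.
            *totals.entry(ext_key(&path)).or_insert(0) += entry.metadata()?.len();
        }
    }
    Ok(())
}
```

The map holds raw u64 totals and nothing else, which is what keeps the later sort honest.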
The fix is boring and it's the same fix every time. Sort on the number, format at the very edge, once, in the line that prints. Don't let the pretty string anywhere near the logic. In practice that means entries.sort_by(|a, b| b.1.cmp(&a.1)) on the raw u64, and then println! does the humanising and nothing else does. Obvious in hindsight, obvious in foresight too if I'd just listen to myself.
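One way to make "format at the very edge" hard to get wrong is to give the byte count its own type and hang the humanising on Display. This is a sketch under assumptions, not the code from the tool: Bytes, sorted_totals and report_line are hypothetical names, the units are decimal (1 MB = 1,000,000 bytes), and the alphabetical tie-break in the sort is my addition to keep the output stable.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical newtype: a raw byte count that only turns into text
/// when it is displayed.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Bytes(u64);

impl fmt::Display for Bytes {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Decimal units are an assumption; swap in 1024 for KiB/MiB.
        const UNITS: [&str; 5] = ["B", "KB", "MB", "GB", "TB"];
        let mut value = self.0 as f64;
        let mut unit = 0;
        while value >= 1000.0 && unit < UNITS.len() - 1 {
            value /= 1000.0;
            unit += 1;
        }
        if unit == 0 {
            write!(f, "{} B", self.0)
        } else {
            // One decimal place; values just under a boundary can round
            // up to "1000.0", which is tolerable for a sketch.
            write!(f, "{:.1} {}", value, UNITS[unit])
        }
    }
}

/// Biggest first, comparing the raw u64 only. Ties fall back to the
/// extension name so the order is deterministic.
fn sorted_totals(totals: HashMap<String, u64>) -> Vec<(String, u64)> {
    let mut entries: Vec<(String, u64)> = totals.into_iter().collect();
    entries.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    entries
}

/// The only place a pretty string gets produced.
fn report_line(ext: &str, total: u64) -> String {
    format!("{}\t{}", Bytes(total), ext)
}
```

Printing is then a loop over sorted_totals(totals) that calls println!("{}", report_line(&ext, total)) for each pair. The sort never sees a formatted string, so the two cannot drift apart.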
Was it worth it over a du and an awk one-liner? Honestly, no, not this one. du -ah | sort -h would have done. But it's a single binary I can drop on the backup box that has no awk worth speaking of, and it didn't take long, even if it wasn't twenty minutes, and I learned nothing except that I already knew the lesson. Which is its own kind of progress.