Ramblings of an aging IT geek
rust

i rewrote a 40-line shell script in rust and i'd do it again

Porting a small log-munging shell script to a Rust CLI, and an honest accounting of whether the effort paid off.

I had a shell script. It read a log, pulled out a field, counted things, and printed a summary. About forty lines of awk and sort | uniq -c, the kind of thing that works fine until someone feeds it a file with a quoted comma in it and the whole edifice quietly produces wrong numbers. I'd been meaning to learn Rust properly, so I rewrote it as a small CLI. The honest question afterwards is whether that was a good use of an evening.
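
To make that failure concrete, here's a minimal sketch, in Rust rather than the original awk, of what splitting on every comma does to a quoted field. The log lines and the naive_fields name are made up for illustration; the real script's input looked different.

```rust
/// Naive field extraction: split on every comma, with no notion of quoting.
/// (Hypothetical helper, standing in for a comma field separator in awk.)
fn naive_fields(line: &str) -> Vec<&str> {
    line.split(',').collect()
}

fn main() {
    // Invented sample lines; each is meant to have three fields.
    let plain = "2024-01-05,GET,/index.html";
    let quoted = "2024-01-05,GET,\"/search?q=a,b\"";

    // The plain line splits into the three fields you'd expect.
    println!("plain:  {} fields", naive_fields(plain).len());
    // The comma inside the quotes is treated as a separator, so this
    // line yields four pieces and every later column shifts by one.
    println!("quoted: {} fields", naive_fields(quoted).len());
}
```

Nothing errors here, which is the problem: the split succeeds and the counts downstream are simply wrong.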

The setup was nicer than I expected. cargo new gives you a project that builds, clap for argument parsing has become genuinely pleasant, and csv plus serde meant the quoted-comma problem that started all this just went away. The compiler was strict in the way people warned me about, but the errors were the most polite and useful I've had from any toolchain. It told me where I'd borrowed something I shouldn't, suggested the fix, and was usually right.

The borrow checker did make me think harder than awk ever has. There was a stretch where I wanted to hold a reference to a field whilst also mutating the map I was counting into, and the compiler simply would not let me. My first instinct was that it was being pedantic. My second, after I'd worked out what it was protecting me from, was that it had caught a real aliasing mistake that the shell version would have papered over by being slow and stringly-typed. That's the trade: you pay the thinking up front instead of in production.
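
The shape of that conflict can be sketched with nothing but the standard library. This is a reconstruction under assumptions, not the code from the actual tool: the function name add_and_report and the sample values are hypothetical. The commented-out lines are the version the compiler rejects; the lines below them are one common way out, which is to take an owned copy so the shared borrow ends before the map is mutated.

```rust
use std::collections::HashMap;

/// Hypothetical helper: look at a key already in the map, then insert
/// a new one, and return the key we looked at.
fn add_and_report(counts: &mut HashMap<String, u64>, new_field: &str) -> String {
    // Rejected by the compiler (error E0502): `seen` is a shared borrow
    // into `counts`, and it is still in use after the mutable `insert`.
    // An insert may reallocate the table, which would leave `seen` dangling.
    //
    // let seen: &String = counts.keys().next().expect("map is non-empty");
    // counts.insert(new_field.to_string(), 1);
    // seen.clone()

    // Accepted: clone the key first, so no reference into the map is
    // alive by the time we mutate it.
    let seen: String = counts.keys().next().expect("map is non-empty").clone();
    counts.insert(new_field.to_string(), 1);
    seen
}

fn main() {
    let mut counts: HashMap<String, u64> = HashMap::new();
    counts.insert("GET".to_string(), 3);

    let seen = add_and_report(&mut counts, "POST");
    println!("saw {} before inserting; map now has {} keys", seen, counts.len());
}
```

The clone costs an allocation, which is the explicit version of what the stringly-typed shell pipeline was doing all along without telling anyone.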

The counting itself ended up as a HashMap<String, u64> and an entry(...).or_insert(0), which reads almost as plainly as the sort | uniq -c it replaced, except it's parsing real CSV underneath rather than splitting on commas and hoping. The whole thing is maybe ninety lines, builds to a release binary in a few seconds, and the only runtime dependency is libc.
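
For anyone who hasn't met the entry API, the counting pattern looks roughly like this. It's a sketch, not the tool's source: count_values is a hypothetical name, and the values are in-memory strings standing in for whatever field the CSV parsing hands over, so no csv crate is involved here.

```rust
use std::collections::HashMap;

/// Count how often each value occurs (hypothetical helper name).
fn count_values(values: &[&str]) -> HashMap<String, u64> {
    let mut counts = HashMap::new();
    for value in values {
        // entry(...) finds the slot for this key, or_insert(0) creates it
        // with a zero count if it is missing, and the += 1 bumps it.
        *counts.entry(value.to_string()).or_insert(0) += 1;
    }
    counts
}

fn main() {
    // Invented sample data.
    let counts = count_values(&["GET", "POST", "GET", "GET"]);

    // HashMap iteration order is unspecified, so sort before printing:
    // highest count first, ties broken by key.
    let mut rows: Vec<(&String, &u64)> = counts.iter().collect();
    rows.sort_by(|a, b| b.1.cmp(a.1).then(a.0.cmp(b.0)));

    for (value, n) in rows {
        println!("{:>7} {}", n, value);
    }
}
```

With the sample data above this prints GET with a count of 3 ahead of POST with a count of 1, which is the same summary sort | uniq -c followed by a reverse numeric sort would give.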

Was it worth it? For this script, on its own, no. The shell version worked. But the binary is a single file I can copy to a box with nothing installed, it's about thirty times faster on the big logs, and it handles the malformed input I was previously ignoring. More to the point, I now have a small, real thing I understand end to end, which is worth far more than the script. The next little tool will take an hour instead of an evening, and that's the actual payoff. Learning a language on a throwaway script is cheap insurance for the day you need it on something that matters.