Ramblings of an aging IT geek
rust

nom finally made parsing feel less like a chore

A first proper go at writing a parser with nom in Rust, and why building big parsers out of small ones clicked.

I needed to parse a small config-ish format this week, the sort of thing that is too structured for a regex and too trivial to justify a grammar file and a code generator. In the past I would have hand-rolled a state machine and regretted it by the third edge case. This time I reached for nom, and I am a convert.

The thing that clicked is that nom parsers are just functions, and you build big ones out of small ones. A parser for a number, a parser for whitespace, a parser for a quoted string, and then you combine them. Each piece is tiny enough to test on its own, and the combinator that glues them together reads roughly like the structure of the thing you are parsing.

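use nom::character::complete::{char, space0};
use nom::sequence::{delimited, separated_pair};
use nom::IResult;

// Parses `key = value`, with optional spaces around the '='.
// `identifier` and `value` are small parsers defined elsewhere.
// The trailing `(input)` call style matches nom 7; nom 8 uses `.parse(input)`.
fn key_value(input: &str) -> IResult<&str, (&str, &str)> {
    separated_pair(
        identifier,
        delimited(space0, char('='), space0),
        value,
    )(input)
}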

The bit I tripped over was the error and remainder handling. IResult is an ordinary Result: on success a parser hands back a pair of the input it did not consume and the value it parsed, and on failure it hands back an error. You thread that remainder into the next parser. Once that lands it stops feeling like magic and starts feeling like plumbing, which is the right amount of magic. The type signatures are heavy at first glance, but they are the same shape every time, so they fade into the background quickly.
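
To make the plumbing concrete, here is a rough sketch of the same key/value parser with nom taken out entirely, using only the standard library. It is not nom's API and not the code from my project: the PResult alias, the error strings, and the identifier, equals and value helpers are all made up for illustration, and the rules they encode (ASCII identifiers, value runs to end of line) are assumptions. What it does share with nom is the shape: every function returns the unconsumed remainder alongside its result, and the next function picks up from there.

```rust
// Plain-std sketch of the shape nom parsers share. NOT nom's API.
// On success: (remaining input, parsed value). On failure: a message.
type PResult<'a, T> = Result<(&'a str, T), String>;

// Hypothetical helper: one or more ASCII alphanumerics or underscores.
fn identifier<'a>(input: &'a str) -> PResult<'a, &'a str> {
    let end = input
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(input.len());
    if end == 0 {
        return Err(format!("expected identifier at {:?}", input));
    }
    Ok((&input[end..], &input[..end]))
}

// Hypothetical helper: an '=' with optional spaces on either side.
fn equals<'a>(input: &'a str) -> PResult<'a, ()> {
    let trimmed = input.trim_start_matches(' ');
    let rest = trimmed
        .strip_prefix('=')
        .ok_or_else(|| format!("expected '=' at {:?}", trimmed))?;
    Ok((rest.trim_start_matches(' '), ()))
}

// Hypothetical helper: the value is everything up to the end of the line.
fn value<'a>(input: &'a str) -> PResult<'a, &'a str> {
    let end = input.find('\n').unwrap_or(input.len());
    Ok((&input[end..], &input[..end]))
}

// The plumbing: each step starts from the remainder the previous step left,
// and `?` passes the first error straight back to the caller.
fn key_value<'a>(input: &'a str) -> PResult<'a, (&'a str, &'a str)> {
    let (rest, key) = identifier(input)?;
    let (rest, _) = equals(rest)?;
    let (rest, val) = value(rest)?;
    Ok((rest, (key, val)))
}

fn main() {
    // The second line is left over as the remainder for a later parser.
    let parsed = key_value("port = 8080\nhost = example");
    assert_eq!(parsed, Ok(("\nhost = example", ("port", "8080"))));

    // No key before the '=', so the very first step fails.
    assert!(key_value("= 8080").is_err());
}
```

The three let lines in key_value are essentially what separated_pair does for you in the nom version, which is why the combinator form ends up reading like the format itself rather than like bookkeeping.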

It is not the tool for a full programming language; I would still reach for a real grammar there. For a small, well-defined format, though, nom turned an afternoon of fiddly string-slicing into about forty lines I actually trust. That is a good trade.