Ramblings of an aging IT geek
rust

how we handled rust errors before anyhow saved us

A look back at Rust error handling in the error-chain and failure days, the Box<dyn Error> dance, and why anyhow and thiserror felt like such a relief.

These days you reach for anyhow in application code, thiserror in libraries, and the question of error handling is more or less settled. It is easy to forget that this was, for years, the single most argued-about corner of Rust. I have been digging through some of my older crates this week and the archaeology is genuinely jarring. So here is a short field guide to how we coped before the modern tooling, written partly so I remember and partly so newer Rust people understand why the old hands sigh with relief when they type ? and it just works.

the std-only era: Box<dyn Error>

Before any of the libraries, you had the standard library and not much else. The honest, dependency-free way to return "something went wrong and I do not want to enumerate every possibility" was a boxed trait object:

use std::error::Error;

fn load_config() -> Result<Config, Box<dyn Error>> {
    let raw = std::fs::read_to_string("config.toml")?;
    let cfg = parse(&raw)?;
    Ok(cfg)
}

This works because the ? operator calls From::from on the error it is propagating, and the standard library provides a blanket From impl that boxes any type implementing Error into a Box<dyn Error>. It is fine for a quick binary. The trouble starts the moment you want context. A Box<dyn Error> that says No such file or directory tells you nothing about which file, in which function, while doing what. You end up manually wrapping with map_err and string formatting, and the result is verbose and inconsistent across a codebase.
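
To make the map_err complaint concrete, here is a minimal std-only sketch of the manual wrapping I mean. The helper name read_with_context is made up for illustration and does not come from any of my old crates.

```rust
use std::error::Error;
use std::fs;

// Hypothetical helper for this sketch: read a file and, on failure,
// replace the bare io::Error with a message that names the path.
fn read_with_context(path: &str) -> Result<String, Box<dyn Error>> {
    fs::read_to_string(path)
        // format! builds a String, and String converts into Box<dyn Error>.
        // The original io::Error is flattened into text at this point.
        .map_err(|e| format!("reading {}: {}", path, e).into())
}
```

Note what is lost: the underlying io::Error survives only as text, so source() on the boxed error returns None and nothing downstream can inspect the original kind. Multiply this by every fallible call and you can see why people wanted a library.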

There was also the matter of Send + Sync. The moment you wanted to move an error across a thread boundary, or return one from anything touching async, plain Box<dyn Error> was not enough and you needed Box<dyn Error + Send + Sync + 'static>, which is a mouthful you typed many, many times.
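
A small sketch of where the extra bounds bite, using only the standard library. The BoxError alias and both function names are assumptions for this example; most codebases of the era grew an alias much like it.

```rust
use std::error::Error;
use std::thread;

// Alias for the long spelling; the name is an assumption for this sketch.
type BoxError = Box<dyn Error + Send + Sync + 'static>;

fn parse_port(raw: &str) -> Result<u16, BoxError> {
    // ParseIntError is Send + Sync, so `?` can box it into BoxError.
    let port = raw.trim().parse::<u16>()?;
    Ok(port)
}

fn parse_port_on_thread(raw: String) -> Result<u16, BoxError> {
    // thread::spawn requires the closure's return value to be Send, so a
    // plain Result<u16, Box<dyn Error>> would be rejected here.
    let handle = thread::spawn(move || parse_port(&raw));
    // join() fails only if the thread panicked; report that as an error too.
    handle
        .join()
        .map_err(|_| BoxError::from("worker thread panicked"))?
}
```

Swap BoxError for Box<dyn Error> in parse_port and the thread::spawn line stops compiling, which is how most of us learned the longer incantation in the first place.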

error-chain, and the macro maze

The first real attempt to make this pleasant was error-chain. It gave you a macro that generated an error type, conversions from foreign errors, and, crucially, the idea of a chain: an error that carries the error beneath it, so you could see the whole causal trail.

error_chain! {
    foreign_links {
        Io(std::io::Error);
        Toml(toml::de::Error);
    }
    errors {
        InvalidConfig(field: String) {
            description("invalid configuration")
            display("invalid configuration field: {}", field)
        }
    }
}

When it worked it was lovely. You got chain_err to add context as the error bubbled up, and a backtrace if you asked for one. But the whole thing was built on a large, opaque macro. When something went wrong inside it the compiler errors were baffling, pointing at generated code you could not see. It also rather wanted to own your entire error story; mixing error-chain with hand-written error types was awkward.
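
The chain idea outlived the crate. As an illustration only, and explicitly not error-chain's own API, here is roughly the same concept expressed with the standard Error::source method. The Wrapped type and the causal_trail function are hypothetical names for this sketch.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical wrapper for this sketch: a message plus the error beneath it.
#[derive(Debug)]
struct Wrapped {
    message: String,
    cause: Box<dyn Error + 'static>,
}

impl fmt::Display for Wrapped {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.message)
    }
}

impl Error for Wrapped {
    // Expose the wrapped error as the next link in the chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.cause)
    }
}

// Collect every message in the chain, outermost first.
fn causal_trail(err: &(dyn Error + 'static)) -> Vec<String> {
    let mut trail = Vec::new();
    let mut current = Some(err);
    while let Some(e) = current {
        trail.push(e.to_string());
        current = e.source();
    }
    trail
}
```

Wrapping an io::Error in Wrapped and passing it to causal_trail yields the outer message followed by the inner one, which is the "whole causal trail" in miniature.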

failure: the great interregnum

Then came failure, which for a good while was the recommended answer and shipped in a lot of well-known crates. It introduced the Fail trait as a replacement for std::error::Error, fixing some genuine shortcomings of the std trait at the time, and a failure::Error type that was essentially a better Box<dyn Error> with a backtrace baked in.

use failure::{Error, ResultExt};

fn load_config() -> Result<Config, Error> {
    let raw = std::fs::read_to_string("config.toml")
        .context("reading config.toml")?;
    let cfg = parse(&raw).context("parsing config")?;
    Ok(cfg)
}

That .context(...) was the good idea everyone remembers. It read well and it produced useful output. The catch was that failure lived in a parallel universe to the standard library. Because it had its own Fail trait rather than building on std::error::Error, it never composed cleanly with code that used the standard trait, and the ecosystem ended up partly split. It was a brave experiment that taught everyone what good looked like, and then it was quietly superseded.

why anyhow and thiserror won

The thing that resolved all this was realising the problem had two halves, and they wanted opposite tools.

In a library you want a precise, enumerable error type so your callers can match on it and make decisions. That is thiserror: a derive macro that writes the boilerplate for a normal enum of errors, with Display and the From conversions and the #[source] plumbing, while leaving you a perfectly ordinary type that implements the standard std::error::Error. No parallel trait, no macro maze.

#[derive(thiserror::Error, Debug)]
pub enum ConfigError {
    #[error("could not read {path}")]
    Io { path: String, #[source] source: std::io::Error },
    #[error("invalid field: {0}")]
    InvalidField(String),
}
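
For a sense of the boilerplate being saved, this is approximately what you would write by hand for the same enum using only the standard library. It is a sketch of equivalent behaviour, not the derive's actual expansion, which may differ in detail.

```rust
use std::error::Error;
use std::fmt;
use std::io;

#[derive(Debug)]
pub enum ConfigError {
    Io { path: String, source: io::Error },
    InvalidField(String),
}

// The #[error("...")] attributes become a Display impl.
impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Io { path, .. } => write!(f, "could not read {}", path),
            ConfigError::InvalidField(name) => write!(f, "invalid field: {}", name),
        }
    }
}

// The #[source] attribute becomes the source() method.
impl Error for ConfigError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ConfigError::Io { source, .. } => Some(source),
            ConfigError::InvalidField(_) => None,
        }
    }
}
```

Two variants is tolerable. Twenty variants, each with its own message and maybe a From impl, is where hand-written versions used to drift out of sync with reality.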

In an application you do not care about enumerating anything; you want to add context cheaply and bubble everything to main. That is anyhow, which is the spiritual successor to failure::Error but built squarely on the standard Error trait so it composes with everything:

use anyhow::{Context, Result};

fn load_config() -> Result<Config> {
    let raw = std::fs::read_to_string("config.toml")
        .context("reading config.toml")?;
    parse(&raw).context("parsing config")
}

Same .context() ergonomics that failure got right, none of the ecosystem split, and the two crates interoperate: a thiserror enum from a library drops straight into an anyhow::Error in the application because both speak the standard trait. The division of labour is the whole insight. Libraries are precise, applications are pragmatic, and ? carries the result across the boundary without ceremony.
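
The boundary crossing can be sketched without either crate. In this std-only illustration, a boxed trait object stands in for anyhow::Error and a hand-written struct stands in for a thiserror type; InvalidField, AppError, validate and run are all hypothetical names.

```rust
use std::error::Error;
use std::fmt;

// Stand-in for a precise library error (what thiserror would derive).
#[derive(Debug)]
struct InvalidField(String);

impl fmt::Display for InvalidField {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid field: {}", self.0)
    }
}

impl Error for InvalidField {}

// Stand-in for anyhow::Error in this std-only sketch.
type AppError = Box<dyn Error + Send + Sync + 'static>;

// "Library" side: returns the precise type so callers can match on it.
fn validate(name: &str) -> Result<(), InvalidField> {
    if name.is_empty() {
        Err(InvalidField("name".to_string()))
    } else {
        Ok(())
    }
}

// "Application" side: `?` converts InvalidField into AppError because
// InvalidField implements the standard Error trait.
fn run(name: &str) -> Result<(), AppError> {
    validate(name)?;
    Ok(())
}
```

The precise type is not lost in transit either: calling downcast_ref::<InvalidField>() on the boxed error recovers it, and anyhow::Error offers a downcast_ref method of its own for the same purpose.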

Looking back through the old code, the through-line is obvious: every one of these tools was chasing the same two things, context and convenient propagation. It just took a few iterations, and one brave detour through a parallel trait, before the answer split cleanly in two. If you started Rust in the last couple of years and anyhow plus thiserror is all you have ever known, count yourself fortunate. The boxed Send + Sync trait objects are still down there holding it all up, but you very rarely have to look.