A year with async/await, and the rewrite I do not regret

async/await stabilised in Rust 1.39 in November 2019, and I have now spent the better part of a year living with it in a real service rather than a toy. This is the report I wish I had read before I started: not the announcement, not the tutorial, but the view from the far side of a rewrite.

What we were rewriting from

The service is a fairly ordinary network thing. It accepts connections, fans requests out to a handful of backends, aggregates the answers, and replies. Before async/await, that meant futures 0.1, combinators, and a great deal of and_then. It worked. It was also, and I say this with affection, a write-only codebase. Here is the shape of the old style, lightly anonymised:

fn handle(req: Request) -> impl Future<Item = Response, Error = Error> {
    fetch_user(req.user_id)
        .and_then(|user| {
            fetch_prefs(user.id)
                .and_then(move |prefs| {
                    fetch_content(prefs.feed)
                        .map(move |content| build_response(user, prefs, content))
                })
        })
}

Read that and tell me where the error from fetch_prefs goes. You can work it out, but you have to work it out, every time. And the moment you needed a value from an early step inside a later closure, you were threading it through move closures by hand, nesting deeper with each dependency. Borrowing across a combinator boundary was its own special misery. We had whole functions whose only job was to carry a tuple of intermediate values down the chain so the final map could see them.

What it looks like now

Here is the same logic after the port:

async fn handle(req: Request) -> Result<Response, Error> {
    let user = fetch_user(req.user_id).await?;
    let prefs = fetch_prefs(user.id).await?;
    let content = fetch_content(prefs.feed).await?;
    Ok(build_response(user, prefs, content))
}

That is the whole pitch, really. It reads like the synchronous code it replaced. The ? operator does the error propagation that used to be a forest of and_then/map_err. The intermediate values are just locals in scope, no manual threading. When I needed to add a step that depended on both user and content, I added a line. In the old world that was a refactor of the closure nesting.
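To make the answer to "where does the error from fetch_prefs go" concrete, here is a minimal sketch of the propagation. Everything in it is a stand-in for illustration: the stub fetchers, the Error variant, the rule that user 0 has no prefs, and the tiny block_on, which only replaces a real runtime so the sketch has no dependencies.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical error type and records; the real service's types are not shown here.
#[derive(Debug, PartialEq)]
enum Error {
    PrefsUnavailable,
}

struct User { id: u32 }
struct Prefs { feed: u32 }

// Stub backends. Assumption for the demo: user 0 has no stored prefs.
async fn fetch_user(user_id: u32) -> Result<User, Error> {
    Ok(User { id: user_id })
}

async fn fetch_prefs(user_id: u32) -> Result<Prefs, Error> {
    if user_id == 0 {
        Err(Error::PrefsUnavailable)
    } else {
        Ok(Prefs { feed: user_id * 10 })
    }
}

async fn fetch_content(feed: u32) -> Result<String, Error> {
    Ok(format!("content for feed {}", feed))
}

async fn handle(user_id: u32) -> Result<String, Error> {
    let user = fetch_user(user_id).await?;
    // If this fails, `?` returns the Err from `handle` right here;
    // fetch_content below is never called.
    let prefs = fetch_prefs(user.id).await?;
    let content = fetch_content(prefs.feed).await?;
    Ok(content)
}

// Minimal busy-polling executor, only good enough for stubs that are always ready.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

Calling block_on(handle(0)) yields Err(Error::PrefsUnavailable), while block_on(handle(3)) runs all three steps. The error leaves through the function's own return type, which is the part you had to reconstruct by hand in the combinator version.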

The first time I ported a gnarly function and the diff was shorter and clearer, I genuinely sat back and grinned. It does not happen often that a language feature makes your existing code smaller and easier at the same time.

The parts that were not free

I do not want to oversell it, because the rewrite was not a weekend. A few things bit.

The runtime question is real. There is no async runtime in the standard library; you bring your own. We are on Tokio, and the migration meant moving from the 0.1 ecosystem to the 0.2 ecosystem, which touched more than just our code: every library that did I/O had to have an async-ready version, and not all of them did on the same timeline. A chunk of the calendar time was waiting for, or patching, dependencies rather than writing our own logic.

Then there is the Send and lifetime situation across an .await point. Anything still alive across an await becomes part of the future's state, and a multi-threaded executor may resume that future on a different thread, so spawning it (with tokio::spawn, for instance) requires the whole future to be Send. An Rc or a non-Send guard that was fine in synchronous code therefore stops the task compiling the moment it is held across an await. The errors are accurate but they arrive in a lump, and the first day of the port was mostly me learning which of my types were quietly not Send and why.
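A small sketch of the usual fix, with no runtime involved. The function names and the assert_send helper are made up for illustration. The helper just imposes the same Send bound a multi-threaded spawn would.

```rust
use std::rc::Rc;

// Compile-time check only: accepts a value if and only if its type is Send.
fn assert_send<T: Send>(_: &T) {}

// Placeholder for any real awaitable work.
async fn tick() {}

// The Rc is still alive at the .await, so it is stored in the future,
// and the future as a whole is not Send.
#[allow(dead_code)]
async fn holds_rc_across_await() -> usize {
    let shared = Rc::new(vec![1, 2, 3]);
    tick().await;
    shared.len()
}

// Same work, but the Rc lives in an inner block that ends before the .await,
// so nothing non-Send is carried across the suspension point.
async fn drops_rc_before_await() -> usize {
    let len = {
        let shared = Rc::new(vec![1, 2, 3]);
        shared.len()
    };
    tick().await;
    len
}

fn check() {
    assert_send(&drops_rc_before_await());
    // Uncommenting the next line is a compile error, because Rc is not Send:
    // assert_send(&holds_rc_across_await());
}
```

The pattern that got me through most of the port was exactly this: pull the non-Send value into a block, take what you need out of it as plain data, and only then await.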

Holding a lock across an await is the trap everyone falls into once. With a guard type that is Send, or on a task that never has to be, it compiles, it runs, and then under load it deadlocks or serialises everything, because you are holding a mutex across a suspension point where the task can be parked for an unbounded time. The rule I now repeat to myself: do the locked work, drop the guard, then await. If you find yourself wanting to await while holding a MutexGuard, stop and restructure.
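Here is the rule as a sketch. Stats, record_hit and notify_backend are hypothetical names, and the small block_on is only there so the example runs without a runtime.

```rust
use std::future::Future;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical shared state guarded by a plain std mutex.
struct Stats {
    hits: Mutex<u64>,
}

// Placeholder for a call that may suspend for a long time.
async fn notify_backend(_count: u64) {}

async fn record_hit(stats: &Stats) -> u64 {
    // Locked work happens inside this block; the guard is dropped at its closing brace.
    let count = {
        let mut hits = stats.hits.lock().unwrap();
        *hits += 1;
        *hits
    };
    // No lock is held here, so other tasks can take the mutex while this one waits.
    notify_backend(count).await;
    count
}

// Minimal busy-polling executor, standing in for a real runtime.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

The restructuring is usually this small: copy out what the later steps need, let the guard go, and await with only plain values in hand.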

Cancellation also changed shape. A future that is dropped is cancelled, and with the combinator style you mostly did not think about partial progress. With async functions it is easier to write code that does several awaits in sequence and assumes they all happen, then be surprised when the task is dropped after the second one. It is not worse than before, it is just that the ergonomics tempt you into longer sequences, and longer sequences have more places to be cut short.
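The "dropped after the second one" case can be reproduced by hand. In this sketch the step names, YieldOnce and run_and_cancel are all invented for the demonstration. The future is polled a fixed number of times and then dropped, which is what cancellation amounts to.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Returns Pending exactly once, standing in for I/O that is not ready yet.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

// Three steps in sequence; the code reads as if all of them always happen.
async fn three_steps(log: Rc<RefCell<Vec<&'static str>>>) {
    log.borrow_mut().push("reserve");
    YieldOnce(false).await;
    log.borrow_mut().push("charge");
    YieldOnce(false).await;
    log.borrow_mut().push("confirm");
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Polls the future at most `polls` times, then drops it, and reports which steps ran.
fn run_and_cancel(polls: usize) -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(three_steps(log.clone()));
    for _ in 0..polls {
        if fut.as_mut().poll(&mut cx).is_ready() {
            break;
        }
    }
    drop(fut); // dropping the future is the cancellation
    let steps = log.borrow().clone();
    steps
}
```

With two polls the log holds "reserve" and "charge" but never "confirm". Nothing panics and nothing is reported; the rest of the function simply never runs, so any step that must be undone needs its cleanup somewhere that survives a drop.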

Was it worth it

Yes, plainly. The maintainability gain is not a nice-to-have; it changed who could work on the code. Before, the request path was a thing two of us understood and everyone else avoided. After the port a newer colleague added a feature to it in their first week, and the review was about the feature, not about how the futures plumbing worked. That is the real return.

Performance was a wash, which is exactly what I hoped for. We did not do this for speed. The shape of the work is the same; it is the same Tokio reactor underneath doing the same epoll. I measured before and after and the throughput and tail latency were within noise. If anything the new code gave us fewer accidental serialisation points, because the explicit .awaits made it obvious where we were waiting and where we could let things run concurrently with join!.
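To show what an accidental serialisation point looks like next to its joined equivalent, here is a dependency-free sketch. Backend is a made-up future that stays pending for a fixed number of polls, and join2 is a hand-rolled stand-in for the join! macro, written out only so the example needs no crates. In real code you would use the macro from your runtime or from the futures crate. Counting polls is a rough proxy for rounds of waiting, not a benchmark.

```rust
use std::future::{poll_fn, Future};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Stand-in for a backend call: Pending for `waits` polls, then yields `value`.
struct Backend { waits: u32, value: u32 }

impl Future for Backend {
    type Output = u32;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.waits == 0 {
            return Poll::Ready(self.value);
        }
        self.waits -= 1;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

// Hand-rolled two-way join: polls both futures on every wake-up until both are done.
async fn join2<A: Future, B: Future>(a: A, b: B) -> (A::Output, B::Output) {
    let (mut a, mut b) = (Box::pin(a), Box::pin(b));
    let (mut out_a, mut out_b) = (None, None);
    poll_fn(move |cx| {
        if out_a.is_none() {
            if let Poll::Ready(v) = a.as_mut().poll(cx) { out_a = Some(v); }
        }
        if out_b.is_none() {
            if let Poll::Ready(v) = b.as_mut().poll(cx) { out_b = Some(v); }
        }
        if out_a.is_some() && out_b.is_some() {
            Poll::Ready((out_a.take().unwrap(), out_b.take().unwrap()))
        } else {
            Poll::Pending
        }
    })
    .await
}

// Two independent calls awaited one after the other: the second waits for the first.
async fn sequential() -> u32 {
    let a = Backend { waits: 2, value: 10 }.await;
    let b = Backend { waits: 2, value: 20 }.await;
    a + b
}

// The same two calls, allowed to make progress together.
async fn concurrent() -> u32 {
    let (a, b) = join2(
        Backend { waits: 2, value: 10 },
        Backend { waits: 2, value: 20 },
    )
    .await;
    a + b
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Drives a future to completion and reports how many polls that took.
fn run_counting_polls<F: Future>(fut: F) -> (F::Output, u32) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, polls);
        }
    }
}
```

Both versions return 30, but sequential needs five polls where concurrent needs three. Two adjacent .await lines with no data dependency between them are the visual cue that a join is possible.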

If you are sitting on a 0.1 futures codebase and wondering whether to make the jump: the language side is ready and genuinely lovely, the ecosystem side will cost you some waiting on dependencies, and the lock-across-await footgun is the one thing to teach the whole team on day one. I would do it again without hesitation, and I have already stopped reaching for combinators by reflex, which a year ago I would not have believed.