Ramblings of an aging IT geek
rust

async/await in rust, and the rewrite i couldn't resist

Rewriting a small threaded Rust service onto async/await with tokio, what got cleaner, and what the borrow checker had to say about it.

Async/await has been stable in Rust since 1.39, back in late 2019, the ecosystem has caught up, and I finally went back and rewrote a small service I had built the old threaded way. The short version: it was worth it, the result is genuinely cleaner, and I spent more time than I would like to admit explaining to the borrow checker what a future is allowed to hold.

The service is unremarkable. It polls a handful of HTTP endpoints, does a bit of work on each response, and writes the results somewhere. The kind of thing every homelab accumulates. The original version spawned a thread per endpoint with std::thread, shared state behind an Arc<Mutex<_>>, and blocked on reqwest's blocking client. It worked. It was also slightly heavier than it needed to be, and adding a new endpoint meant thinking about thread lifetimes, which I resented.
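
A minimal sketch of that thread-per-endpoint shape, not the original code. It uses only the standard library so it runs on its own: `fetch_blocking` is a hypothetical stand-in for the blocking HTTP call and parsing, and the shared state is simplified to a list of results.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for the blocking HTTP call plus parsing.
// The service described above used reqwest's blocking client at this point.
fn fetch_blocking(url: &str) -> usize {
    url.len()
}

// One OS thread per endpoint, each pushing its result into shared state.
fn poll_all_threaded(urls: Vec<String>) -> Vec<(String, usize)> {
    let results = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = urls
        .into_iter()
        .map(|url| {
            let results = Arc::clone(&results);
            thread::spawn(move || {
                let reading = fetch_blocking(&url);
                // Lock only long enough to record the result.
                results.lock().unwrap().push((url, reading));
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    // Every worker has finished and dropped its clone, so this is the last Arc.
    let mut collected = Arc::try_unwrap(results)
        .expect("no other Arc clones remain")
        .into_inner()
        .unwrap();
    collected.sort(); // thread completion order is not deterministic
    collected
}
```

Even at this size the bookkeeping is visible: clone the Arc, move it into the thread, join every handle, then unwrap the shared state again.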

why rewrite it at all

I want to be honest that the service was fine. Nobody asked for this. The threaded version handled its dozen endpoints without breaking a sweat, and threads-and-mutexes is a perfectly respectable way to write Rust. If you have a working threaded program and you are reaching for async purely because it is the fashionable thing, stop and ask what you are actually buying.

What I was buying, in my case, was two things. First, the I/O is overwhelmingly waiting on the network, which is exactly the shape async is good at: lots of tasks that are idle most of the time. Second, I wanted to learn the modern idioms properly on something low-stakes before I met them somewhere they mattered. A small personal service is the right place to be confused.

what got cleaner

The fan-out-and-collect pattern is where async earns its place. In the threaded version, kicking off all the requests and gathering the results meant spawning threads, handing each a clone of an Arc, joining the handles, and unpicking the results. With async it collapses into something you can read in one breath:

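use futures::future::join_all;

// Endpoint, Reading, Error and parse_reading are the service's own items, not shown here.

// Start one fetch per endpoint and wait for all of them to finish.
async fn poll_all(endpoints: &[Endpoint]) -> Vec<Result<Reading, Error>> {
    // Futures are lazy: nothing runs until join_all is awaited.
    let tasks = endpoints.iter().map(|e| fetch_one(e));
    join_all(tasks).await
}

// Fetch one endpoint's body and parse it into a Reading.
// The `?` operators need Error to implement From<reqwest::Error>.
async fn fetch_one(endpoint: &Endpoint) -> Result<Reading, Error> {
    let body = reqwest::get(&endpoint.url).await?.text().await?;
    parse_reading(&body)
}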

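All the endpoints are now in flight concurrently, multiplexed by the runtime instead of being given one OS thread each. Strictly speaking, join_all polls all of its futures from the single task that awaits it; spreading them across tokio's worker threads would take a tokio::spawn per endpoint. For a dozen endpoints the difference from the threaded version is academic. For a few hundred it stops being academic. And the code says what it means: fetch each one, await them all, collect.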

I used tokio as the runtime, because it is the one everything else in the ecosystem assumes, and reqwest with its default async client, which is the same library minus the blocking feature. The #[tokio::main] macro on main hides the runtime setup, which is fine until the day you need to configure the runtime yourself, at which point you find out what it was hiding.

what the borrow checker had to say

Here is where I paid my tuition. The threaded version held shared state in Arc<Mutex<State>> and that mostly worked because each thread locked, did its thing, and unlocked. Move to async and the rules shift in a way that is easy to walk into.

The trap is holding a std::sync::Mutex guard across an .await. A guard is not Send, the future that holds it stops being Send, and the moment you try to tokio::spawn that future the compiler tells you so in a wall of text about MutexGuard not being safe to send between threads. It is technically correct and initially baffling. The compiler's own objection is narrow: a spawned task may resume on a different worker thread after an await, and a std guard has to be released on the thread that took it. The habit it forces on you also avoids a real hazard: if a task parks at an await point while holding a lock, any other task that tries to take that lock blocks its whole worker thread until the first one is resumed, and on a small runtime you have built a deadlock with extra steps.

There are two honest fixes. Either narrow the lock so the guard is dropped before any await:

let value = {
    let state = state.lock().unwrap();
    state.current.clone()
}; // guard dropped here, before the await below
do_something_async(value).await;

Or, if you genuinely need to hold state across awaits, reach for tokio::sync::Mutex, whose lock() is itself awaited and whose guard is designed to be held across await points. It is slower than the std mutex and you should not default to it, but it exists for exactly this case. I used the scoped-drop approach almost everywhere and the async mutex in one spot where the logic genuinely needed it.
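
The Send requirement can be checked without a runtime at all. The sketch below is standard library only, under a couple of assumptions: `require_spawnable` is a hypothetical helper that copies the bounds tokio::spawn places on its argument, `ready(()).await` stands in for real async work, and the state is simplified to a `Vec<u32>`. The scoped-drop version compiles; the commented-out version is the shape that triggers the MutexGuard error.

```rust
use std::future::{ready, Future};
use std::sync::{Arc, Mutex};

// Hypothetical helper: accepts only futures that satisfy the same
// bounds tokio::spawn places on its argument.
fn require_spawnable<F>(future: F) -> F
where
    F: Future + Send + 'static,
    F::Output: Send + 'static,
{
    future
}

// Compiles: the guard lives in an inner scope and is dropped before the await.
fn snapshot_len(state: Arc<Mutex<Vec<u32>>>) -> impl Future<Output = usize> {
    require_spawnable(async move {
        let len = {
            let guard = state.lock().unwrap();
            guard.len()
        }; // guard dropped here
        ready(()).await; // stands in for real async work
        len
    })
}

// Does not compile if uncommented: the guard is still alive at the await,
// so the future is not Send.
//
// fn held_across_await(state: Arc<Mutex<Vec<u32>>>) -> impl Future<Output = usize> {
//     require_spawnable(async move {
//         let guard = state.lock().unwrap();
//         ready(()).await;
//         guard.len()
//     })
// }
```

Uncommenting the second function is a cheap way to see the error on purpose, in a dozen lines, before meeting it in real code.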

The other thing that caught me was lifetimes in spawned tasks. tokio::spawn wants a future that is 'static, which means it cannot borrow local data; it has to own what it uses. So a lot of &endpoint became endpoint.clone() moved into the task. Coming from the borrow-everything threaded style this felt like a regression, and it is a small real cost: you clone more. In practice the things being cloned are cheap, and the alternative is wrestling with scoped spawns, which is a fight I chose not to have for a service this size.
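
The clone-into-the-task shape can be sketched with the standard library alone. Assumptions here: `Endpoint` is reduced to a hypothetical struct with just a `url` field, the formatted string stands in for the real fetch, and a boxed `Send + 'static` future type stands in for what tokio::spawn would demand of its argument.

```rust
use std::future::Future;
use std::pin::Pin;

// Hypothetical, cut-down Endpoint with only the field used below.
#[derive(Clone)]
struct Endpoint {
    url: String,
}

// A boxed future carrying the same Send + 'static requirement
// that tokio::spawn places on the future it is given.
type SpawnableTask = Pin<Box<dyn Future<Output = String> + Send + 'static>>;

// Compiles: the task owns its own copy of the endpoint.
fn task_for(endpoint: &Endpoint) -> SpawnableTask {
    let owned = endpoint.clone();
    Box::pin(async move { format!("polled {}", owned.url) })
}

// Does not compile if uncommented: the future would borrow `endpoint`,
// which does not live for 'static.
//
// fn borrowing_task_for(endpoint: &Endpoint) -> SpawnableTask {
//     Box::pin(async move { format!("polled {}", endpoint.url) })
// }
```

The cost is one clone per task. If the cloned value ever stops being cheap, wrapping it in an Arc keeps the clone down to a reference-count bump.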

was it worth it

For this particular program, the performance win is theoretical. A dozen network calls finish fast either way. What I actually got was code that is shorter, reads more directly, and will scale to far more endpoints without me thinking about thread lifetimes ever again. I also now understand the Send-across-await failure mode from the inside, having caused it three times in an afternoon, which is worth more than the rewrite itself.

If you have a threaded Rust program that is working, you do not need to do this. If you have I/O-bound work, a reason to learn, and an afternoon to spend arguing with the compiler about what your futures are allowed to carry, async/await is in genuinely good shape now. Just keep your locks short and do not hold a guard across an await. The compiler will not let you forget, but it is kinder to learn it on purpose.