I Benchmarked Trait Objects Against Generics So You Don't Have To


Someone in a review told me to "use generics here, dynamic dispatch is slow". They were probably right that generics were the better call, but the reasoning bothered me, because "slow" without a number is just folklore. So I sat down with criterion and measured it, on the actual shape of code I write rather than a toy that the optimiser can see through.

The setup is a trait with one method that does a small amount of real work (a hash and a couple of arithmetic ops, enough that it can't be optimised away to nothing) and a collection I iterate over, calling the method a million times. One version stores Box<dyn Score>, the other is monomorphised over a concrete type.

trait Score {
    fn score(&self, seed: u64) -> u64;
}

// dynamic
fn run_dyn(items: &[Box<dyn Score>], seed: u64) -> u64 {
    items.iter().map(|i| i.score(seed)).fold(0, u64::wrapping_add)
}

// static
fn run_gen<T: Score>(items: &[T], seed: u64) -> u64 {
    items.iter().map(|i| i.score(seed)).fold(0, u64::wrapping_add)
}
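
For reference, here is a minimal sketch of how the two collections could be built and checked against each other. The Mixer type, its id field, the helper names and the constant 31 are assumptions made up for illustration, not the code behind the numbers below; the hash comes from the standard library's DefaultHasher.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

trait Score {
    fn score(&self, seed: u64) -> u64;
}

// Hypothetical implementer: not the type that was benchmarked.
struct Mixer {
    id: u64,
}

impl Score for Mixer {
    fn score(&self, seed: u64) -> u64 {
        // One hash plus a couple of arithmetic ops, as described in the text.
        let mut h = DefaultHasher::new();
        (self.id, seed).hash(&mut h);
        h.finish().wrapping_mul(31).wrapping_add(seed)
    }
}

fn run_dyn(items: &[Box<dyn Score>], seed: u64) -> u64 {
    items.iter().map(|i| i.score(seed)).fold(0, u64::wrapping_add)
}

fn run_gen<T: Score>(items: &[T], seed: u64) -> u64 {
    items.iter().map(|i| i.score(seed)).fold(0, u64::wrapping_add)
}

// Values stored inline in one contiguous Vec.
fn build_gen(n: u64) -> Vec<Mixer> {
    (0..n).map(|id| Mixer { id }).collect()
}

// Each value boxed separately and reached through a trait object.
fn build_dyn(n: u64) -> Vec<Box<dyn Score>> {
    (0..n)
        .map(|id| Box::new(Mixer { id }) as Box<dyn Score>)
        .collect()
}

fn main() {
    let seed = 42;
    let a = run_gen(&build_gen(1_000), seed);
    let b = run_dyn(&build_dyn(1_000), seed);
    // Same work either way; only the dispatch mechanism differs.
    assert_eq!(a, b);
    println!("both paths agree: {a}");
}
```

Checking that both paths return the same sum is worth doing before timing anything, since a benchmark comparing two functions that compute different things tells you nothing.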

The numbers

On my machine (a fairly ordinary desktop, --release, criterion's defaults), over a million elements:

static dispatch: ~1.18 ms
dynamic dispatch: ~1.41 ms

So the generic version was about 20% faster on this workload (the dynamic run took roughly 1.19 times as long). Real, repeatable, not noise. But look at the absolute figures. That's 0.23 ms of difference across a million calls, or roughly a quarter of a nanosecond per call. If your hot loop is doing a million trait calls and a quarter of a millisecond matters, you already know it and you've already reached for generics. For the other 99% of code, this is comfortably below the threshold where anyone will ever notice.
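
The arithmetic behind those figures is short enough to write down. This sketch takes the two timings above and the call count as inputs; the function name compare is just a label for illustration.

```rust
/// Returns (dynamic/static time ratio, extra nanoseconds per call).
fn compare(static_ms: f64, dynamic_ms: f64, calls: f64) -> (f64, f64) {
    let ratio = dynamic_ms / static_ms;
    // 1 ms = 1_000_000 ns, spread over `calls` calls.
    let per_call_ns = (dynamic_ms - static_ms) * 1_000_000.0 / calls;
    (ratio, per_call_ns)
}

fn main() {
    let (ratio, per_call_ns) = compare(1.18, 1.41, 1_000_000.0);
    println!("dynamic/static: {ratio:.3}");
    println!("extra per call: {per_call_ns:.2} ns");
}
```

With the timings above, the ratio comes out at about 1.195 and the extra cost at about 0.23 ns per call.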


The interesting part isn't the vtable indirection itself, it's what it prevents. The dynamic version generally can't inline score, so the optimiser is blind across that call boundary. With generics the whole thing monomorphises, the call inlines, and now the compiler can hoist, vectorise, and constant-fold across what used to be a function boundary. The 20% isn't the cost of one pointer chase, it's the cost of an optimisation wall. (One caveat: a slice of Box<dyn Score> also stores pointers to separately allocated values where &[T] stores the values contiguously, so some of the gap may be memory layout rather than dispatch.)
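
To make the "optimisation wall" concrete, here is a sketch with a deliberately trivial, hypothetical implementer called Doubler. Once score is inlined into run_gen, every iteration yields the same value, so the compiler is free to collapse the loop into something like run_folded. Whether it actually does depends on the compiler version and flags, so treat run_folded as a hand-written illustration of what is permitted, not a claim about the generated code. Through a vtable, that rewrite is generally not available.

```rust
trait Score {
    fn score(&self, seed: u64) -> u64;
}

// Hypothetical implementer with a body that ignores `self`.
struct Doubler;

impl Score for Doubler {
    fn score(&self, seed: u64) -> u64 {
        seed.wrapping_mul(2)
    }
}

// Monomorphised: for T = Doubler the compiler can see the body of `score`.
fn run_gen<T: Score>(items: &[T], seed: u64) -> u64 {
    items.iter().map(|i| i.score(seed)).fold(0, u64::wrapping_add)
}

// Through a vtable: each call is indirect, so the body is usually opaque here.
fn run_dyn(items: &[Box<dyn Score>], seed: u64) -> u64 {
    items.iter().map(|i| i.score(seed)).fold(0, u64::wrapping_add)
}

// Hand-written equivalent of run_gen::<Doubler>: the per-item value is
// loop-invariant, so the sum is just len * (seed * 2) in wrapping arithmetic.
fn run_folded(len: usize, seed: u64) -> u64 {
    (len as u64).wrapping_mul(seed.wrapping_mul(2))
}

fn gen_items(n: usize) -> Vec<Doubler> {
    (0..n).map(|_| Doubler).collect()
}

fn dyn_items(n: usize) -> Vec<Box<dyn Score>> {
    (0..n).map(|_| Box::new(Doubler) as Box<dyn Score>).collect()
}

fn main() {
    let seed = 5;
    let g = run_gen(&gen_items(1_000), seed);
    let d = run_dyn(&dyn_items(1_000), seed);
    let f = run_folded(1_000, seed);
    // All three compute the same number; they differ in what the optimiser can see.
    assert_eq!(g, d);
    assert_eq!(g, f);
    println!("{g}");
}
```

If you want to know what your compiler really did, look at the generated assembly rather than guessing.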

So which should you reach for?

My honest answer after measuring: pick generics when the type is known at compile time and the call is in a hot path, because it costs you nothing at run time and occasionally buys you a lot. Pick trait objects when you genuinely need heterogeneous collections or you want to keep compile times and binary size down, and stop apologising for it. The dispatch overhead is real and it is also, for almost everything you'll write, completely irrelevant.
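
The heterogeneous case is the one generics can't cover directly, because a &[T] holds exactly one concrete T. A minimal sketch, with two hypothetical implementers (Flat and Scaled) invented for illustration:

```rust
trait Score {
    fn score(&self, seed: u64) -> u64;
}

// Two hypothetical implementers with different behaviour.
struct Flat(u64);
struct Scaled(u64);

impl Score for Flat {
    fn score(&self, seed: u64) -> u64 {
        self.0.wrapping_add(seed)
    }
}

impl Score for Scaled {
    fn score(&self, seed: u64) -> u64 {
        self.0.wrapping_mul(seed)
    }
}

fn run_dyn(items: &[Box<dyn Score>], seed: u64) -> u64 {
    items.iter().map(|i| i.score(seed)).fold(0, u64::wrapping_add)
}

// One Vec holding two different concrete types behind the same trait.
fn mixed() -> Vec<Box<dyn Score>> {
    let mut items: Vec<Box<dyn Score>> = Vec::new();
    items.push(Box::new(Flat(1)));
    items.push(Box::new(Scaled(3)));
    items.push(Box::new(Flat(10)));
    items
}

fn main() {
    // (1 + 2) + (3 * 2) + (10 + 2) = 21
    println!("{}", run_dyn(&mixed(), 2));
}
```

A run_gen::<T> over the same data would need all three items to share one type, which usually means wrapping them in an enum and matching on it by hand.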

The thing I'd push back on is the reflexive "dyn is slow" in code review. It's slower, by a measurable and usually unimportant amount. If you're going to make a performance argument, bring a benchmark. I did, and it mostly told me I'd been worrying about the wrong thing.
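
If you want to try this on your own code before pulling in a benchmarking crate, a crude harness using only the standard library looks something like the sketch below. It is not what produced the numbers above: criterion adds warm-up, many samples and statistical analysis, and this has none of that. The Item type, the mixing constant and mean_time are assumptions for illustration.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

trait Score {
    fn score(&self, seed: u64) -> u64;
}

// Hypothetical implementer, for illustration only.
struct Item(u64);

impl Score for Item {
    fn score(&self, seed: u64) -> u64 {
        (self.0 ^ seed)
            .wrapping_mul(0x9E37_79B9_7F4A_7C15)
            .rotate_left(17)
    }
}

fn run_dyn(items: &[Box<dyn Score>], seed: u64) -> u64 {
    items.iter().map(|i| i.score(seed)).fold(0, u64::wrapping_add)
}

fn run_gen<T: Score>(items: &[T], seed: u64) -> u64 {
    items.iter().map(|i| i.score(seed)).fold(0, u64::wrapping_add)
}

/// Runs `f` `iters` times and returns the mean wall-clock time per run.
/// `iters` must be non-zero.
fn mean_time<F: FnMut() -> u64>(iters: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        // black_box stops the compiler from discarding the unused result.
        black_box(f());
    }
    start.elapsed() / iters
}

fn main() {
    let n = 1_000_000u64;
    let gen_items: Vec<Item> = (0..n).map(Item).collect();
    let dyn_items: Vec<Box<dyn Score>> = (0..n)
        .map(|i| Box::new(Item(i)) as Box<dyn Score>)
        .collect();

    // black_box on the inputs discourages specialising on known values.
    let t_gen = mean_time(20, || run_gen(black_box(gen_items.as_slice()), black_box(7)));
    let t_dyn = mean_time(20, || run_dyn(black_box(dyn_items.as_slice()), black_box(7)));
    println!("static:  {t_gen:?}");
    println!("dynamic: {t_dyn:?}");
}
```

Build it with --release, and treat a single run as a rough hint rather than a result; for anything you plan to quote in a review, a proper harness is worth the setup.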