Ramblings of an aging IT geek
performance

How much does crossing into the kernel actually cost?

A quick measurement of syscall overhead with strace and a tight benchmark loop, and why batching writes matters.


People throw around "syscalls are expensive" without a number attached, so I went and got one. The job was a logging path that called write() once per line, and I had a hunch the line count, not the byte count, was the problem.

A tight loop doing a million one-byte write()s to /dev/null clocked in at well under a microsecond each on the box I tested: a few hundred nanoseconds of overhead per call once you strip the work out. That sounds tiny. It is tiny, until you do it a few hundred thousand times a second, at which point you're spending whole CPU seconds every minute doing nothing but the round trip into the kernel and back. strace -c on the real process confirmed it: the time column of the summary was dominated by write, not by anything else the process asked of the kernel. (By default strace -c reports system time per syscall rather than wall clock; -w switches it to wall clock.)
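For anyone who wants a number from their own box, here is a minimal sketch of that kind of loop in Python. It is not the loop I ran; the function name ns_per_write and its defaults are made up for illustration. The interpreter adds its own overhead to every iteration, so treat the result as an upper bound on the syscall cost. A C loop would land closer to the bare figure.

```python
import os
import time


def ns_per_write(calls: int = 1_000_000, path: str = os.devnull) -> float:
    """Average wall-clock nanoseconds per one-byte os.write() to `path`."""
    payload = b"x"
    fd = os.open(path, os.O_WRONLY)
    try:
        start = time.perf_counter_ns()
        for _ in range(calls):
            os.write(fd, payload)  # one write(2) syscall per iteration
        elapsed = time.perf_counter_ns() - start
    finally:
        os.close(fd)
    # Includes Python loop and call overhead, not just the kernel round trip.
    return elapsed / calls


if __name__ == "__main__":
    print(f"{ns_per_write():.0f} ns per write()")
```

os.write() is used instead of a file object's write() because file objects buffer by default, which would hide exactly the per-call cost being measured.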

The fix was the obvious one. Buffer the lines and flush in chunks, so one write() carries a few kilobytes instead of forty bytes. Same total data, a fraction of the calls, and the calls column for write on the strace summary collapsed. The kernel was happy to take it all in one go.
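The shape of that fix, as a hedged sketch rather than the code from the real logging path: ChunkedWriter, its 4096-byte threshold, and the syscalls counter are all hypothetical names and values picked for illustration. It collects lines in memory and only calls write() once the buffer passes the threshold.

```python
import os


class ChunkedWriter:
    """Collects lines in memory and hands them to write() in large chunks."""

    def __init__(self, fd: int, chunk_size: int = 4096) -> None:
        self.fd = fd
        self.chunk_size = chunk_size  # hypothetical flush threshold, in bytes
        self.buf = bytearray()
        self.syscalls = 0  # how many times os.write() was actually called

    def write_line(self, line: bytes) -> None:
        self.buf += line
        if len(self.buf) >= self.chunk_size:
            self.flush()

    def flush(self) -> None:
        # os.write() may accept fewer bytes than offered, so loop until empty.
        while self.buf:
            written = os.write(self.fd, self.buf)
            self.syscalls += 1
            del self.buf[:written]


if __name__ == "__main__":
    fd = os.open(os.devnull, os.O_WRONLY)
    out = ChunkedWriter(fd)
    for _ in range(100_000):
        out.write_line(b"x" * 39 + b"\n")  # a forty-byte log line
    out.flush()  # don't lose the tail that never reached the threshold
    os.close(fd)
    print(out.syscalls, "write() calls for 100000 lines")
```

In practice the standard library already does this: a file opened with open(path, "wb") is buffered by default. The hand-rolled version is only here to make the call count visible.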

None of this is news (buffered I/O is older than I am), but it's worth having the actual figure in your head. A syscall isn't free and it isn't ruinous. It's a few hundred nanoseconds, and the only time that matters is when you've decided to pay it a million times a second for no reason.
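To put that last sentence into arithmetic, here is a back-of-the-envelope sketch. The helper core_fraction is hypothetical, and 300 ns is an assumed stand-in for "a few hundred nanoseconds", not a measured value.

```python
def core_fraction(calls_per_second: float, ns_per_call: float) -> float:
    """Fraction of one CPU core spent purely on syscall round trips."""
    return calls_per_second * ns_per_call / 1e9


if __name__ == "__main__":
    # 300 ns is an assumed stand-in for "a few hundred nanoseconds".
    print(core_fraction(1_000_000, 300))
```

With those inputs it comes out to 0.3, so roughly a third of a core goes on the crossing alone.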