I had a service eating more CPU than it had any right to, and I was certain I knew where. The request parser, obviously. It always is. I'd spent half a day staring at that code, finding nothing, getting cross.
So I gave up guessing and did what I should have done first:
perf record -F 99 -g -p $(pidof myservice) -- sleep 30
perf script | stackcollapse-perf.pl | flamegraph.pl > out.svg
The parser was a sliver. The fat tower in the middle was a logging helper that formatted a context string on every request, whether or not the line was ever emitted. Roughly 18% of CPU, building messages we threw away because the level was set to warn in production.
The fix was embarrassing. Make the formatting lazy so it only happens when the line will actually be logged. One commit, CPU down by about a sixth, and a small lesson I keep relearning: I am consistently wrong about where the time goes. The profiler isn't. Stop arguing with it.