A box that should have been doing almost nothing was sitting at a steady five or six percent CPU. Not alarming, nothing was on fire, but it was the wrong kind of nothing. This host runs one small service that handles a request every few seconds. It should idle near zero and spike briefly when work arrives, not hum along at a constant low burn forever.
top told me the obvious thing: the CPU was busy and no single process owned it. The usage was spread thin and mostly in sys, which is the tell. When the time is going into the kernel rather than any one userspace process, the question stops being "which program" and becomes "which syscall, on whose behalf".
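For anyone who wants that user-versus-sys split as a number rather than a glance at top, here is a minimal sketch. It assumes Linux and the aggregate "cpu" line of /proc/stat, whose first eight counters are user, nice, system, idle, iowait, irq, softirq and steal. The helper names are mine and purely illustrative; this is not how top computes it internally, just the same idea done by hand.

```python
# Sketch: split CPU time between two /proc/stat samples into shares.
# Assumes Linux; function names are hypothetical, chosen for this example.

FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu ...' line from /proc/stat into a dict of ticks."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, (int(p) for p in parts[1:1 + len(FIELDS)])))


def cpu_shares(before, after):
    """Return each field's share of the ticks elapsed between two samples."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    delta = {name: b[name] - a[name] for name in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {name: ticks / total for name, ticks in delta.items()}


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as f:
        return f.readline()
```

Take one read_cpu_line() sample, sleep a second, take another, and pass both to cpu_shares(). If "system" clearly outweighs "user" on a box that is supposed to be idle, that is the same tell top was giving me.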
So, perf top. It samples where the CPU actually is, kernel symbols included, and ranks the hot functions live:
perf top -g
The top of the list was not application code at all. It was timer and polling machinery in the kernel, the kind of symbols you see when something is waking up far more often than it has any reason to. With the call graph turned on (-g), the chain led back to my service: it was sitting in a tight-ish loop with a poll timeout I had set to one millisecond years ago, "just to be responsive", and then forgotten.
One millisecond means waking a thousand times a second to check for work that arrives every few seconds. The work itself was trivial. The cost was entirely the waking up. A thousand context switches and timer reprogrammings per second, on a box that handles one real event every few seconds, is a beautifully pointless way to spend a CPU.
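The arithmetic is worth writing down once. This is a back-of-the-envelope sketch, not a measurement: it treats the timeout as the only thing pacing the loop, and the three-second gap between events is a number I picked to stand in for "every few seconds".

```python
# Sketch: how many timer wakeups a polling loop spends per real event.
# The 3-second gap between events is an assumed stand-in for "every few seconds".


def wakeups_per_second(timeout_ms):
    """Upper bound on wakeups per second for a loop that sleeps timeout_ms each pass."""
    return 1000.0 / timeout_ms


def wakeups_per_event(timeout_ms, seconds_between_events):
    """Wakeups spent, on average, waiting for one real event."""
    return wakeups_per_second(timeout_ms) * seconds_between_events


if __name__ == "__main__":
    per_second = wakeups_per_second(1)      # the 1 ms timeout from the post
    per_event = wakeups_per_event(1, 3)     # assumed: one event every 3 s
    print(f"{per_second:.0f} wakeups/s, {per_event:.0f} wakeups per event")
```

Under those assumptions, only about one wakeup in three thousand finds anything to do.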
The fix was to stop polling on a timer and block on the actual event source, so the thread sleeps until there is genuinely something to do and the kernel wakes it once, when it matters. CPU dropped to where it should always have been, flat near zero with brief honest spikes.
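To make the difference concrete, here is a small sketch of both patterns side by side. It is not my service's code: it uses Python's selectors module and a socketpair as a stand-in event source, and the function names are invented for the example. The only real difference between the two waits is the timeout passed to select().

```python
# Sketch: polling on a 1 ms timer versus blocking on the event source.
# A socketpair stands in for the real event source; names are illustrative.
import selectors
import socket
import threading


def wait_polling(sock, timeout_s=0.001):
    """Old pattern: wake every timeout_s to check for work. Returns (data, wakeups)."""
    sel = selectors.DefaultSelector()
    sel.register(sock, selectors.EVENT_READ)
    wakeups = 0
    try:
        while True:
            wakeups += 1
            if sel.select(timeout=timeout_s):  # usually returns [] on timeout
                return sock.recv(4096), wakeups
    finally:
        sel.close()


def wait_blocking(sock):
    """Fixed pattern: sleep until the socket is readable. Returns (data, wakeups)."""
    sel = selectors.DefaultSelector()
    sel.register(sock, selectors.EVENT_READ)
    wakeups = 0
    try:
        while True:
            wakeups += 1
            if sel.select(timeout=None):  # no timer; the kernel wakes us on data
                return sock.recv(4096), wakeups
    finally:
        sel.close()


def demo(wait, delay_s=0.2):
    """Deliver one message after delay_s and report how the wait behaved."""
    reader, writer = socket.socketpair()
    timer = threading.Timer(delay_s, writer.send, args=(b"work",))
    timer.start()
    try:
        return wait(reader)
    finally:
        timer.join()
        reader.close()
        writer.close()


if __name__ == "__main__":
    print("polling :", demo(wait_polling))
    print("blocking:", demo(wait_blocking))
```

Both versions hand back the same message. The polling one gets there after many empty wakeups, while the blocking one should normally report a single wakeup, which is the whole fix in miniature.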
The lesson I keep relearning: a low, constant background load is worth chasing precisely because it is low. It is rarely the work. It is almost always something waking up too often to do nothing, and perf top will point straight at it in about ten seconds, which is a good deal faster than I would have found it by guessing.