A box that does almost nothing was sitting at a steady 100% on one core. No load on it to speak of, a handful of idle daemons, and yet top showed a core pinned and %sy high. System time, not user time, which is the interesting bit: the CPU wasn't running my code, it was running the kernel's. top will happily tell you a core is busy. It won't tell you what it's busy doing. For that you want perf top.
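To make the user-versus-system split concrete, here is a minimal sketch that computes it per core from two snapshots of /proc/stat, which is where top gets these numbers. The function name cpu_split and the one-second window are my own choices for illustration, and the sketch assumes the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal).

```shell
#!/bin/sh
# Print per-core user% and system% between two snapshots of /proc/stat.
# Each cpuN line is: cpuN user nice system idle iowait irq softirq steal ...
# cpu_split is a hypothetical helper name, not part of any tool.
cpu_split() {
    # $1 = file with the first snapshot, $2 = file with the second
    awk '
        /^cpu[0-9]+ / {
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            if (NR == FNR) {        # still reading the first snapshot
                user0[$1] = $2; sys0[$1] = $4; total0[$1] = total
                next
            }
            dt = total - total0[$1]
            if (dt > 0)
                printf "%s user=%.0f%% sys=%.0f%%\n", $1,
                    100 * ($2 - user0[$1]) / dt, 100 * ($4 - sys0[$1]) / dt
        }
    ' "$1" "$2"
}

# Usage on a live box:
#   a=$(mktemp); b=$(mktemp)
#   cat /proc/stat > "$a"; sleep 1; cat /proc/stat > "$b"
#   cpu_split "$a" "$b"; rm -f "$a" "$b"
```

A core showing high sys and low user is the pattern described above: the time is going to the kernel, not to anyone's application code.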
perf top is the sampling profiler's blunt instrument. It interrupts the CPU many times a second, records the instruction pointer, and shows you a live, sorted list of the hottest symbols across the whole machine, kernel and userland together. No instrumentation, no restart, no recompile. You run it on the live box and watch where the time is actually going.
$ perf top -g
The -g adds call graphs so you can see who's calling the hot function, not just the function itself. Within a second or two the answer was on screen, and it was a kernel symbol sitting at the top with everything else a rounding error.
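If you would rather have something you can paste into a ticket than a live screen, the same sampling can be captured to a file and printed as text. This is a sketch, not what I ran that day: profile_core is a made-up helper name, the core number is whichever one top shows pinned, and it assumes you have the privileges perf needs for system-wide sampling (typically root).

```shell
#!/bin/sh
# PERF can be overridden, e.g. PERF=echo for a dry run that only prints
# the commands instead of executing them.
PERF=${PERF:-perf}

# Sample one core with call graphs for a fixed window, then print a report.
# $1 = core number (hypothetical; use the one top shows pinned)
# $2 = seconds to sample (default 10)
# $3 = output file (default perf.data)
profile_core() {
    core=$1
    secs=${2:-10}
    data=${3:-perf.data}
    # -C restricts sampling to the given CPU; the sleep only sets the duration.
    "$PERF" record -C "$core" -g -o "$data" -- sleep "$secs" &&
    "$PERF" report -i "$data" --stdio
}

# Usage: profile_core 2 10
```

For the live view, perf top takes the same -C option, so perf top -g -C 2 would watch only core 2.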
A single kernel function eating a whole core on an idle box usually means spinning on something: a lock held too long, a timer firing far too often, or a path being walked in a tight loop when it should be sleeping. The call graph pointed at the subsystem, and from there it was a short hop to a misconfigured daemon polling far more aggressively than anyone intended. The fix was a config line. The diagnosis was the hard part, and perf top turned it from a guess into a thirty-second observation.
What I keep coming back to is how much faster this is than reasoning about it. I could have spent an afternoon reading through what each daemon does, forming a theory, testing it, being wrong. Instead I asked the kernel directly: what are you spending cycles on, right now? It told me. The first rule of performance work is don't theorise about the bottleneck, measure it, and perf top is about the lowest-effort way I know to measure where a single busy machine is spending its time.
One caveat worth stating: you need kernel symbols available, otherwise you get a wall of hex addresses instead of function names. On most distributions the symbols are already there or a debuginfo package away, though you will usually need to run perf top as root to see them. Get that sorted once and perf top becomes the first thing I reach for whenever a machine is busy and won't tell me why.
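A quick way to check the symbol situation before you need it is to look at /proc/kallsyms, the kernel's own symbol table. Depending on settings such as kernel.kptr_restrict, an unprivileged read can show every address as zeros, which is a hint that perf will struggle to name kernel functions for that user. The helper name kallsyms_visible below is my own, and this is a rough check rather than a guarantee of what perf will resolve.

```shell
#!/bin/sh
# Succeed if a kallsyms-style file exposes at least one non-zero address.
# $1 = file to inspect (defaults to /proc/kallsyms)
# kallsyms_visible is a hypothetical helper name.
kallsyms_visible() {
    awk '$1 !~ /^0+$/ { found = 1; exit } END { exit !found }' \
        "${1:-/proc/kallsyms}"
}

# Usage:
#   if kallsyms_visible; then
#       echo "kernel addresses visible to this user"
#   else
#       echo "kernel addresses hidden; try as root"
#   fi
#   # The two sysctls that most often get in the way:
#   cat /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid
```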