The service wasn't slow exactly. It just used far more CPU than it had any right to. A small Go API, mostly reading from Postgres and serving JSON, sitting at 70% of a core under a load that should have left it idle. Nobody was complaining yet, but the autoscaler was, and that costs money.
My instinct, always wrong, was that it was the database layer. It's always the database layer in my head. So before I changed anything I did the boring correct thing and turned on the profiler. Go makes this nearly free: import net/http/pprof, expose it on an internal port, and pull thirty seconds of CPU profile off a running instance.
go tool pprof -http=:8080 "http://localhost:6060/debug/pprof/profile?seconds=30"
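That opens pprof's web UI in a browser, and the flamegraph is one click away in the View menu. The width of each frame is proportional to the CPU time sampled in that function and everything it calls. You read it top down and look for something fat that you didn't expect.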
The widest frame, by a comfortable margin, was time.Parse. Not SQL. Not JSON marshalling. Date parsing, accounting for nearly a third of the CPU the process was burning.
It took me a minute to even believe it. Then I went looking, and there it was: a logging middleware that, for every single request, parsed a timestamp out of an upstream header into a time.Time so it could compute a request age for a metric. time.Parse with a custom layout string is not cheap, it allocates, and we were calling it on the hot path of every request whether or not anyone ever looked at that metric. Under load that's tens of thousands of parses a second, each one doing layout matching it didn't need to.
The fix was almost embarrassing. The original value was a Unix epoch in milliseconds, an integer. Something upstream of the middleware was formatting it into a string, and we were parsing that string straight back. Skipping the round trip and doing a single strconv.ParseInt dropped that frame off the flamegraph entirely. CPU at the same load went from 70% of a core to about 25%. The autoscaler relaxed. Nobody noticed, which is the correct outcome for performance work.
The lesson isn't about date parsing. It's that I'd have bet good money on the database, started tuning queries, and found nothing, because the hot path was in a place I'd never have thought to look. The profiler doesn't have a theory. It just tells you where the time went, and it's almost never where you'd guess. Measure first. I keep relearning this, and I keep being surprised, which is its own kind of lesson.