Ramblings of an aging IT geek
debugging

the leak was a map i forgot to empty

A slow-growing memory leak in a long-running service turned out to be a cache map that only ever had keys added to it, never removed.

[Image: a terminal showing a process memory graph]

A service had been creeping upward in memory for days. Not a spike, a slope: a few megabytes an hour, dead straight, until the box started swapping and the alerts went off. The straightness was the tell. A leak that grows in lockstep with traffic is usually something accumulating per request and never letting go.

It was a map. Of course it was a map.

The code kept a cache keyed by request ID, populated on the way in so a later stage could look up some context. The lookup happened. What never happened was the delete afterwards. Every request added an entry; nothing ever removed one. The map just grew, quietly, forever, holding onto the context objects long after anyone needed them. Classic.

cache[reqID] = ctx
// ... later
val := cache[reqID]
// the delete(cache, reqID) that should live here does not exist

The fix was one line. Add the delete, or better, stop using an unbounded map as a cache at all and reach for something with eviction. I went with the one line for now and a ticket for the proper fix.
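
Here is a minimal sketch of both options in one place, assuming Go (which the snippet above resembles) and string request IDs. The names boundedCache, Take, and reqContext are hypothetical, not the service's real code. Take does the lookup and the delete in one step, and Put evicts the oldest entry once a fixed capacity is reached, so a request that never reaches the lookup stage still cannot grow the map without limit.

```go
package main

import (
	"container/list"
	"fmt"
)

// reqContext is a hypothetical stand-in for the per-request context object.
type reqContext struct{ User string }

// entry is what the eviction list stores: the key plus the cached value.
type entry struct {
	reqID string
	ctx   *reqContext
}

// boundedCache holds at most capacity entries and evicts the oldest first
// (FIFO). It is not safe for concurrent use; a real service would wrap the
// methods in a sync.Mutex.
type boundedCache struct {
	capacity int
	order    *list.List               // front = oldest entry
	items    map[string]*list.Element // reqID -> its element in order
}

func newBoundedCache(capacity int) *boundedCache {
	return &boundedCache{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// Put stores ctx under reqID, evicting the oldest entry if the cache is full.
func (c *boundedCache) Put(reqID string, ctx *reqContext) {
	if el, ok := c.items[reqID]; ok {
		el.Value.(*entry).ctx = ctx // same key again: just update the value
		return
	}
	c.items[reqID] = c.order.PushBack(&entry{reqID: reqID, ctx: ctx})
	if c.order.Len() > c.capacity {
		oldest := c.order.Remove(c.order.Front()).(*entry)
		delete(c.items, oldest.reqID)
	}
}

// Take returns the context for reqID and removes it in the same step, so a
// completed lookup never leaves an entry behind.
func (c *boundedCache) Take(reqID string) (*reqContext, bool) {
	el, ok := c.items[reqID]
	if !ok {
		return nil, false
	}
	e := c.order.Remove(el).(*entry)
	delete(c.items, reqID)
	return e.ctx, true
}

// Len reports how many entries are currently held.
func (c *boundedCache) Len() int { return len(c.items) }

func main() {
	cache := newBoundedCache(3)
	for i := 0; i < 5; i++ {
		cache.Put(fmt.Sprintf("req-%d", i), &reqContext{User: "alice"})
	}
	fmt.Println("entries held:", cache.Len()) // prints "entries held: 3"

	_, ok := cache.Take("req-0")
	fmt.Println("req-0 still cached:", ok) // prints "req-0 still cached: false"

	_, ok = cache.Take("req-4")
	fmt.Println("req-4 found:", ok, "entries held:", cache.Len()) // prints "req-4 found: true entries held: 2"
}
```

FIFO is only the simplest eviction policy to show here. A TTL or LRU policy may suit a real workload better, and the right capacity depends on how many requests can be in flight at once.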

The lesson isn't "don't leak memory"; everyone knows that. It's that a perfectly linear memory graph is a gift. It told me the leak was tied to a counter I could see, the request count, which meant something was accumulating once per request, which narrowed forty files down to about three. The graph did most of the debugging before I read a single line of code.
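
The graph-reading step can be put into numbers. Below is a rough sketch, assuming a Go service and that a request counter can be sampled alongside the heap size. The names sample, takeSample, and bytesPerRequest are hypothetical, and the data in main is synthetic, not measurements from the service. It fits a least-squares line through (requests, heap bytes) pairs; the slope is an estimate of how many bytes each request leaves behind.

```go
package main

import (
	"fmt"
	"runtime"
)

// sample pairs a request-counter reading with the heap size at that moment.
type sample struct {
	Requests  uint64
	HeapBytes uint64
}

// takeSample reads the live heap size from the Go runtime. The request count
// is passed in because where it comes from is service-specific.
func takeSample(requests uint64) sample {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return sample{Requests: requests, HeapBytes: m.HeapAlloc}
}

// bytesPerRequest fits a least-squares line through the samples and returns
// its slope. ok is false when there is too little data to fit a line.
func bytesPerRequest(samples []sample) (slope float64, ok bool) {
	if len(samples) < 2 {
		return 0, false
	}
	n := float64(len(samples))
	var sumX, sumY float64
	for _, s := range samples {
		sumX += float64(s.Requests)
		sumY += float64(s.HeapBytes)
	}
	meanX, meanY := sumX/n, sumY/n

	var num, den float64
	for _, s := range samples {
		dx := float64(s.Requests) - meanX
		num += dx * (float64(s.HeapBytes) - meanY)
		den += dx * dx
	}
	if den == 0 { // every sample has the same request count
		return 0, false
	}
	return num / den, true
}

func main() {
	// Synthetic data: a 50 MiB baseline plus 2 KiB retained per request.
	const base = 50 << 20
	var samples []sample
	for reqs := uint64(0); reqs <= 3000; reqs += 1000 {
		samples = append(samples, sample{Requests: reqs, HeapBytes: base + 2048*reqs})
	}
	if slope, ok := bytesPerRequest(samples); ok {
		fmt.Printf("%.0f bytes per request\n", slope) // prints "2048 bytes per request"
	}

	// In a real service, samples would come from calls like this one.
	fmt.Println("live heap bytes right now:", takeSample(0).HeapBytes)
}
```

Real samples are noisier than this because the heap size moves with garbage collection, so the fit needs a longer window. A slope close to the size of one retained object is a strong hint about what is piling up.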