The runaway process that cgroups v2 quietly contained

There's a particular kind of bad afternoon where one process decides to allocate all the memory on a box, the OOM killer wakes up grumpy, and it picks something important to murder instead of the actual culprit. I've had that afternoon more than once. On a recent one I finally did something about it properly, and cgroups v2 made it almost boring.

The job was a batch importer that occasionally got fed a pathological input and ballooned past 30 GB. On the old setup that meant a host-wide OOM, where the kernel's choice of victim looked more or less random from where I sat, and sshd was as likely to die as the importer. Not ideal when you're trying to get back in to see what happened.

The fix was to put a hard memory ceiling on the importer's own cgroup. With systemd on cgroups v2 that's a two-directive drop-in for the service unit:

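# /etc/systemd/system/importer.service.d/limit.conf
[Service]
# Hard ceiling: past this, the OOM killer acts inside this cgroup only
MemoryMax=8G
# Throttle point: above this, the kernel reclaims and slows the service
MemoryHigh=6G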

MemoryHigh is the soft limit: once usage climbs above 6G, the kernel reclaims aggressively from the cgroup and throttles its processes, slowing the importer down without killing anything. MemoryMax is the wall: if usage hits 8G and reclaim can't bring it back under, only this cgroup's tasks get OOM-killed, not the rest of the host. The first time the pathological input came back, the importer died cleanly, alone, with a clear log line, and everything else carried on. sshd lived. I lived.
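If you want to see the throttle and the wall at work, the kernel keeps per-cgroup counters in memory.events. What follows is a minimal sketch, not something from my actual setup: it assumes the unit's cgroup lives at /sys/fs/cgroup/system.slice/importer.service (the usual place for a system service, but confirm with systemctl show -p ControlGroup importer), and the helper names are mine. The directory only exists while the unit is active, so this is for watching a running importer, not for a post-mortem.

```python
from __future__ import annotations

from pathlib import Path

# Assumption: default location for a system service's cgroup on a
# unified-hierarchy host. Verify on your own machine before relying on it.
CGROUP_DIR = Path("/sys/fs/cgroup/system.slice/importer.service")


def parse_flat_keyed(text: str) -> dict[str, int]:
    """Parse a cgroup v2 flat-keyed file such as memory.events.

    Each line is "<key> <integer>"; anything else is skipped.
    """
    counters: dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            counters[parts[0]] = int(parts[1])
    return counters


def parse_limit(text: str) -> int | None:
    """memory.max and memory.high hold a byte count, or 'max' for no limit."""
    value = text.strip()
    return None if value == "max" else int(value)


def report(cgroup_dir: Path) -> None:
    """Print the configured ceiling and the pressure counters for one cgroup."""
    events = parse_flat_keyed((cgroup_dir / "memory.events").read_text())
    limit = parse_limit((cgroup_dir / "memory.max").read_text())
    print("memory.max (bytes):", "unlimited" if limit is None else limit)
    # "high" counts throttle events, "max" counts hits against the ceiling,
    # "oom_kill" counts processes in this cgroup killed by the OOM killer.
    for key in ("high", "max", "oom_kill"):
        print(f"{key}: {events.get(key, 0)}")


if __name__ == "__main__":
    if CGROUP_DIR.is_dir():
        report(CGROUP_DIR)
    else:
        print(f"{CGROUP_DIR} not found; is the unit running?")
```

With MemoryMax=8G in place, memory.max should read 8589934592, since systemd treats the G suffix as base 1024. A climbing "high" counter with "oom_kill" still at zero means the throttle is doing its job before the wall is ever reached.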

The thing I appreciate about the unified hierarchy is that there's one tree to reason about, not the v1 tangle of separate controllers that never quite agreed on what a "group" was. systemd-cgls shows you the whole thing, systemctl status importer shows the memory accounting inline, and the blast radius of a runaway is now exactly the cgroup you put it in.
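The "one tree" point shows up in /proc too. On a unified-hierarchy host, /proc/<pid>/cgroup has a single entry of the form 0::<path>, where v1 would list one line per mounted controller hierarchy. Here is a small sketch for pulling that path out, assuming nothing beyond the documented file format; the function name is my own invention.

```python
from __future__ import annotations

from pathlib import Path


def unified_cgroup_path(proc_cgroup_text: str) -> str | None:
    """Return the cgroup v2 path from the contents of /proc/<pid>/cgroup.

    Lines look like "hierarchy-ID:controller-list:path". The v2 entry has
    hierarchy ID 0 and an empty controller list. Returns None when there is
    no such entry, as on a pure v1 host.
    """
    for line in proc_cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        hierarchy_id, controllers, path = parts
        if hierarchy_id == "0" and controllers == "":
            return path
    return None


if __name__ == "__main__":
    proc_file = Path("/proc/self/cgroup")
    if proc_file.exists():
        print(unified_cgroup_path(proc_file.read_text()))
    else:
        print("no /proc/self/cgroup here; not a Linux host?")
```

Run against the importer's PID instead of self, this should print something like /system.slice/importer.service, which is the same node systemd-cgls draws in its tree.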