Ramblings of an aging IT geek
linux

how much swap does a homelab actually need

After years of guessing, I settled on a swap policy for my homelab boxes based on what actually happens when memory runs out.

The swap debate is one of those arguments that never quite dies. Some people run zero swap on principle. Some still cargo-cult the old "twice your RAM" rule from a machine that had 256MB. I've held both positions at various points and been wrong every time, so I finally sat down and decided on something for my homelab that I could actually justify.

The thing that changed my mind was watching what happens without swap. A node with no swap doesn't run faster under pressure. It just hits the OOM killer sooner and with less grace. With nowhere to put cold anonymous pages, the only memory the kernel can reclaim is the page cache, so it thrashes that instead, and the cache is the bit you actually wanted fast. On a box that's mostly idle ZFS plus a handful of containers, that's the wrong trade.

So here's where I landed. Every node gets a small amount of real swap, enough to let the kernel evict genuinely cold pages but not enough to paper over a real leak. On the machines with NVMe that's a 4GB swapfile; on the spinning-rust boxes I keep it to 2GB because you do not want to be servicing page faults from a 7200rpm disk. Then I tune the knobs to match:
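As a rough sketch of that sizing rule in Python: the kernel exposes a per-device `rotational` flag under `/sys/block`, which is enough to tell spinning disks from NVMe. The 4GB and 2GB figures come from the policy above; the `/swapfile` path and the function names are placeholders I made up for illustration, and the commands are only built as strings, never run. It also assumes the swapfile lives on a filesystem that supports swapfiles.

```python
from pathlib import Path

# Sizes from the policy above. The swapfile path used below is a
# placeholder, not a value taken from any real box.
SWAP_GIB_SSD = 4
SWAP_GIB_ROTATIONAL = 2


def is_rotational(device: str, sysfs_root: str = "/sys/block") -> bool:
    """Return True if the kernel reports the block device as spinning media."""
    flag = Path(sysfs_root) / device / "queue" / "rotational"
    return flag.read_text().strip() == "1"


def swap_size_gib(rotational: bool) -> int:
    """Apply the policy: less swap where page faults would be slow."""
    return SWAP_GIB_ROTATIONAL if rotational else SWAP_GIB_SSD


def swapfile_commands(size_gib: int, path: str = "/swapfile") -> list[str]:
    """Build (but do not run) the usual commands for creating a swapfile."""
    return [
        f"dd if=/dev/zero of={path} bs=1M count={size_gib * 1024} status=progress",
        f"chmod 600 {path}",
        f"mkswap {path}",
        f"swapon {path}",
    ]


if __name__ == "__main__":
    # Pretend this node has a spinning disk. On a real box you would ask
    # is_rotational() about the device that actually backs the swapfile.
    for command in swapfile_commands(swap_size_gib(rotational=True)):
        print(command)
```

Keeping the decision in one small function means every node gets the same answer for the same hardware, which is most of the point of having a policy.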

vm.swappiness = 10
vm.vfs_cache_pressure = 50

swappiness = 10 doesn't mean "use 10% swap"; it's a bias. Low values tell the kernel to prefer reclaiming page cache over swapping out anonymous memory, but it can still swap when that's genuinely the right call. Setting it all the way to zero is the trap: on modern kernels that's nearly "never swap until you're desperate", which puts you right back at the abrupt OOM behaviour I was trying to avoid.
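To put a number on "it's a bias", here is a deliberately simplified model of how I understand the kernel to weigh the two kinds of memory: anonymous pages get a weight equal to swappiness and page cache gets 200 minus swappiness. The real reclaim code also scales these by how costly recent reclaim from each list has been and treats zero specially, so read this as a first-order illustration rather than what the kernel literally computes. The function names are mine.

```python
def reclaim_weights(swappiness: int) -> tuple[int, int]:
    """Return (anon_weight, file_weight) for a vm.swappiness value.

    Simplified model: anonymous memory is weighted by swappiness and
    page cache by 200 - swappiness. The kernel further adjusts these
    by recent reclaim cost, which is ignored here.
    """
    if not 0 <= swappiness <= 200:
        raise ValueError("swappiness must be between 0 and 200")
    return swappiness, 200 - swappiness


def anon_share(swappiness: int) -> float:
    """Fraction of reclaim pressure aimed at anonymous pages in this model."""
    anon, file_backed = reclaim_weights(swappiness)
    return anon / (anon + file_backed)


if __name__ == "__main__":
    for value in (0, 10, 60, 100):
        anon, file_backed = reclaim_weights(value)
        print(f"swappiness={value}: anon={anon} file={file_backed} "
              f"anon share={anon_share(value):.1%}")
```

In this model swappiness = 10 still sends a small slice of reclaim pressure at anonymous memory, which is the "can still swap when it has to" behaviour, while zero sends none.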

The other half is earlyoom. Swap buys you grace, but grace under memory pressure can also mean a box that's alive enough to answer ping and dead enough to be useless. earlyoom watches available memory and free swap and kills the worst offender while the box is still responsive, rather than waiting for the kernel's own OOM killer, which tends to wade in only after the machine has been thrashing for a while. On a homelab that's usually some runaway Java thing or a container I misconfigured, and I'd much rather lose that than have the whole node lock up while it swaps itself to death.
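The kind of check earlyoom makes can be sketched in a few lines of Python. This is not earlyoom's code: it only parses `/proc/meminfo`-style text and decides whether both available memory and free swap have dropped below a threshold. The 10% and 5% defaults mirror what I understand earlyoom's defaults to be, the sample numbers are invented, and picking and killing a victim is left out entirely.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: value in KiB}."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[name.strip()] = int(fields[0])
    return values


def pressure_action(info: dict[str, int],
                    term_percent: float = 10.0,
                    kill_percent: float = 5.0) -> str:
    """Return "ok", "SIGTERM" or "SIGKILL" for the given memory state.

    A signal is only suggested when BOTH available RAM and free swap
    are at or below the threshold, so taking the larger of the two
    percentages is enough.
    """
    mem_pct = 100.0 * info["MemAvailable"] / info["MemTotal"]
    swap_total = info.get("SwapTotal", 0)
    # With no swap configured, count free swap as 0% so RAM alone decides.
    swap_pct = 100.0 * info["SwapFree"] / swap_total if swap_total else 0.0
    headroom = max(mem_pct, swap_pct)
    if headroom <= kill_percent:
        return "SIGKILL"
    if headroom <= term_percent:
        return "SIGTERM"
    return "ok"


# Invented sample: a 16GB box with 4GB of swap, both nearly exhausted.
SAMPLE_MEMINFO = """\
MemTotal:       16000000 kB
MemAvailable:    1200000 kB
SwapTotal:       4000000 kB
SwapFree:         300000 kB
"""

if __name__ == "__main__":
    print(pressure_action(parse_meminfo(SAMPLE_MEMINFO)))
```

Requiring both numbers to be low is what lets swap do its job first: a box with plenty of free swap is left alone even when RAM is tight.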

For anything I actually care about staying up, I also set per-service memory limits in the unit files. MemoryMax is the hard ceiling where the service gets killed; a sensible MemoryHigh below it is the soft one, where the cgroup gets throttled and reclaimed from first, which gives it a chance to recover before it's killed outright. Swap at the system level plus limits at the service level means a misbehaving app degrades instead of taking its neighbours with it.
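A small helper for those limits might look like the sketch below. It renders a systemd drop-in with MemoryHigh set to a percentage of MemoryMax. The 80% ratio, the 2048M ceiling in the demo, and the function name are assumptions for illustration, not values from my units. The output would go in a drop-in directory for the service, such as `/etc/systemd/system/<unit>.d/`, followed by a `systemctl daemon-reload`.

```python
def memory_dropin(max_mib: int, high_percent: int = 80) -> str:
    """Render a [Service] drop-in with MemoryHigh below MemoryMax.

    max_mib is the hard limit in MiB. high_percent is an assumed ratio
    for the soft limit, not a systemd default.
    """
    if max_mib <= 0:
        raise ValueError("max_mib must be positive")
    if not 0 < high_percent < 100:
        raise ValueError("high_percent must be between 1 and 99")
    high_mib = max_mib * high_percent // 100
    return (
        "[Service]\n"
        f"MemoryHigh={high_mib}M\n"
        f"MemoryMax={max_mib}M\n"
    )


if __name__ == "__main__":
    # Hypothetical service allowed 2GiB before it gets killed.
    print(memory_dropin(2048), end="")
```

Generating the pair together keeps MemoryHigh from drifting above MemoryMax when a limit gets bumped in a hurry.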

Is this optimal? Probably not. But it's defensible, it's consistent across every box, and I've stopped having the argument with myself. That last part is worth more than the few hundred megabytes.