Ramblings of an aging IT geek

How much swap, and the answer I finally stopped arguing about

After years of cargo-culting swap sizing rules, I settled on a deliberate policy for my homelab: a small swap, a tuned swappiness, and zram where it earns its keep.

The swap question is one of those topics where everyone has a confident answer and most of them are repeating something they read in 2006. "Twice your RAM." "No swap, RAM is cheap." "Swap is always bad, it means you're thrashing." I've held most of these positions at one time or another, usually loudly, and I've finally sat down and decided what I actually do across my homelab rather than re-litigating it every time I provision a box.

Here's the short version: a small fixed swap on disk, swappiness turned down but not off, and zram on the boxes that are memory-constrained. The reasoning is the interesting part.

Why not "no swap"

The seductive argument is that swap means slowness and RAM is plentiful, so just don't have any and let the OOM killer handle the rest. I ran a couple of machines this way for a while. The problem is that swap on Linux is not only for "you ran out of RAM". Given somewhere to put them, the kernel will page out anonymous pages that are genuinely cold (allocated once at startup and never touched again) and reclaim that physical RAM for page cache and for pages that are actually in use. With no swap at all, those cold pages have nowhere to go, because unlike file-backed pages they can't simply be dropped and re-read from disk, so they sit in precious RAM doing nothing.
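If you want to see this on a box that does have swap, a rough sketch like the following lists which processes currently have pages sitting in swap. It reads the VmSwap field from /proc/<pid>/status, which is a standard kernel interface. The function name and the optional proc-root argument are made up for illustration.

```shell
#!/usr/bin/env bash
# List processes that currently have pages in swap, largest first.
# swap_by_process is a hypothetical helper name; the optional first
# argument lets you point it at a copy of /proc instead of the live one.
swap_by_process() {
    local proc_root="${1:-/proc}"
    local status name kb
    for status in "$proc_root"/[0-9]*/status; do
        # Processes can exit mid-scan, so skip anything unreadable.
        [ -r "$status" ] || continue
        name=$(sed -n 's/^Name:[[:space:]]*//p' "$status" 2>/dev/null)
        # Kernel threads have no VmSwap line, so kb stays empty for them.
        kb=$(awk '/^VmSwap:/ {print $2; exit}' "$status" 2>/dev/null)
        if [ -n "$kb" ] && [ "$kb" -gt 0 ]; then
            printf '%s kB\t%s\n' "$kb" "$name"
        fi
    done | sort -nr
}
```

Running swap_by_process on a machine that has been up for a few weeks usually turns up long-idle daemons near the top, which is the cold memory this section is talking about.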

The other thing "no swap" gets you is a worse failure mode under memory pressure. Without swap, when you run low, the kernel's only lever is to drop caches and then invoke the OOM killer, and the OOM killer's judgement about which process to shoot is not something I want to rely on at 3am. A little swap gives the system room to breathe and page out cold stuff before it reaches for the gun.
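For a feel of who the OOM killer would pick right now, the kernel exposes a per-process score in /proc/<pid>/oom_score, and higher means more likely to be shot. A small sketch that ranks processes by it follows. The helper name and its arguments are hypothetical, and it only reads the scores, it doesn't change them.

```shell
#!/usr/bin/env bash
# Rank processes by the kernel's current OOM score, highest first.
# oom_candidates is a hypothetical helper; arguments are an optional
# proc root (default /proc) and how many rows to show (default 10).
oom_candidates() {
    local proc_root="${1:-/proc}" limit="${2:-10}"
    local dir score name
    for dir in "$proc_root"/[0-9]*; do
        # Entries can vanish while we scan; skip what we cannot read.
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        name=$(cat "$dir/comm" 2>/dev/null) || continue
        # Columns: score, PID, command name.
        printf '%s\t%s\t%s\n' "$score" "${dir##*/}" "$name"
    done | sort -nr | head -n "$limit"
}
```

If the top entry is something you would hate to lose, that is a decent argument for not leaving the decision entirely to the kernel.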

Why not "loads of swap"

The flip side is the old "twice your RAM" rule, which made sense when RAM was measured in megabytes and you genuinely might need to hibernate the lot to disk. On a server with 32GB of RAM, provisioning 64GB of swap is just buying yourself a slower, more confusing way to fall over. If a process leaks until it's chewed through 32GB of RAM and is now reaching into 64GB of disk-backed swap, the machine isn't surviving, it's dying slowly and dragging everything else down with it as the disk thrashes. I would rather that runaway hit a wall and get OOM-killed than be allowed to grind for twenty minutes first.

So large swap doesn't prevent the bad outcome, it just delays and worsens it. The exception is genuine hibernate-to-disk on a laptop, where you do need swap at least the size of RAM. None of my homelab boxes hibernate, so that doesn't apply.

What I actually run

A small, deliberate swap. On most boxes that's a few gigabytes, regardless of how much RAM they have. Enough to absorb cold pages and give the kernel somewhere to stage things, not enough to let a leak run for ages.
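As a sketch of what setting that up looks like with a swap file rather than a partition, something along these lines works on ext4 and XFS. The path and size are placeholders, and run and make_swapfile are hypothetical helpers. By default the script only prints the commands it would run, so you can read them before committing.

```shell
#!/usr/bin/env bash
# Create and enable a fixed-size swap file.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: make_swapfile <path> <size in MiB>
make_swapfile() {
    local path="$1" size_mib="$2"
    # dd writes every block, avoiding the holes that make swapon
    # reject sparse files on some filesystems.
    run dd if=/dev/zero of="$path" bs=1M count="$size_mib" status=none
    run chmod 600 "$path"   # swap contents must not be readable by others
    run mkswap "$path"
    run swapon "$path"
}
```

As root, DRY_RUN=0 make_swapfile /swapfile 4096 would create and enable a 4GB file, and a line such as "/swapfile none swap sw 0 0" in /etc/fstab brings it back after a reboot. Btrfs needs extra steps for swap files, so treat this as a starting point rather than a recipe for every filesystem.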

Swappiness turned down. The default vm.swappiness of 60 is a general-purpose value that lets the kernel swap out idle application memory fairly readily in order to keep file cache hot. On a server I want the kernel to prefer keeping application memory resident, reclaiming page cache first and swapping only pages that are genuinely cold:

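# /etc/sysctl.d/99-swap.conf
# Prefer reclaiming page cache over swapping application memory (default 60).
vm.swappiness = 10
# Hold on to dentry and inode caches longer than the default of 100 would.
vm.vfs_cache_pressure = 50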

Ten, not zero. Setting swappiness to zero is the trap I see people fall into, thinking they've "disabled" swap. It doesn't disable anything: on current kernels it tells the kernel not to swap application memory until free and file-backed pages have dropped below the high watermark, by which point the box is already in trouble and an OOM kill becomes far more likely than a cold page being paged out. That is the very behaviour I was trying to avoid by having swap in the first place.
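Files under /etc/sysctl.d are only read at boot or when you ask, so it's worth confirming the running values match what you wrote. A minimal sketch of a check follows. check_sysctl is a hypothetical helper, and it relies only on the fact that a sysctl key like vm.swappiness maps to the file /proc/sys/vm/swappiness.

```shell
#!/usr/bin/env bash
# Compare a live sysctl value against the value you expect.
# Usage: check_sysctl <key> <expected> [sysctl root, default /proc/sys]
check_sysctl() {
    local key="$1" want="$2" sys_root="${3:-/proc/sys}"
    local path got
    # vm.swappiness -> /proc/sys/vm/swappiness
    path="$sys_root/$(printf '%s' "$key" | tr . /)"
    got=$(cat "$path" 2>/dev/null) || { echo "$key: unreadable"; return 2; }
    if [ "$got" = "$want" ]; then
        echo "$key = $got (ok)"
    else
        echo "$key = $got (expected $want)"
        return 1
    fi
}
```

After running sysctl --system as root to reload the drop-in files, check_sysctl vm.swappiness 10 and check_sysctl vm.vfs_cache_pressure 50 should both report ok.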

zram where RAM is tight. On the smaller nodes, a couple of older NUCs with not much memory, I run zram, which is a compressed swap device that lives in RAM itself. It sounds like a contradiction, swapping into memory, but it works: cold pages typically compress at around 2:1 to 3:1, depending on what's in them, so the effective capacity goes up without ever touching the disk. The cost is some CPU time for compression, but it's still much faster than swapping to an SSD, let alone to spinning rust, and it's the single change that made the constrained boxes feel less fragile.

# Run as root. zramctl prints the device it set up, which is not always zram0.
modprobe zram
dev=$(zramctl --find --size 2G --algorithm zstd)
mkswap "$dev"
swapon --priority 100 "$dev"

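Swap devices are used in priority order, highest first, and on-disk swap enabled without an explicit priority gets a negative one. A priority of 100 therefore means the kernel reaches for compressed RAM-swap before the on-disk swap, which is exactly what you want.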
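To check the compression ratio you're actually getting rather than trusting a rule of thumb, the zram driver publishes counters in /sys/block/<device>/mm_stat, where the first two fields are the uncompressed and compressed data sizes in bytes. The sketch below divides one by the other. zram_ratio is a hypothetical helper, and the default path assumes the device ended up as zram0.

```shell
#!/usr/bin/env bash
# Print the current zram compression ratio (original / compressed).
# Usage: zram_ratio [path to mm_stat, default /sys/block/zram0/mm_stat]
zram_ratio() {
    local mm_stat="${1:-/sys/block/zram0/mm_stat}"
    # Field 1: orig_data_size, field 2: compr_data_size (both bytes).
    # Prints n/a when nothing has been swapped to the device yet.
    awk '$2 > 0 { printf "%.2f\n", $1 / $2; found = 1 }
         END    { if (!found) print "n/a" }' "$mm_stat"
}
```

Alongside it, swapon --show lists every active swap device with its PRIO column, so you can confirm the zram device sits above the on-disk swap.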

The honest conclusion

The reason the swap debate never ends is that the right answer depends on the workload, and most arguments are conducted with the workload left unstated. For a database box you'd tune differently again. But for a general-purpose homelab running a pile of containers and the odd VM, a small swap with low swappiness, plus zram on the memory-starved nodes, has been quietly correct for long enough that I've stopped second-guessing it.

Swap isn't a moral failing and it isn't a magic safety net. It's a slow tier of memory, and like any slow tier, the skill is in deciding how much of it to have and how eagerly to use it. I've made my decision. I'd encourage you to make yours on purpose, rather than inheriting it from a forum post older than the hardware.