Ramblings of an aging IT geek

the day journald ate the root partition

How systemd-journald quietly filled a root disk, and the handful of settings that keep it bounded, fast and actually useful.

The page came in at an awkward hour, as they do: a small fleet of boxes had stopped doing useful work because / was full. Not nearly full. Full, zero bytes, the kind of full where even logging in to investigate is a fight because half the things you run want to write a temp file. The culprit, once I could see straight, was the journal. /var/log/journal had grown to several gigabytes per host and nobody had ever told it to stop.

This is one of those problems that's entirely self-inflicted and entirely avoidable, and I'd avoided thinking about it for years because journald's defaults are "fine" right up until they aren't. So this is the writeup I wish I'd read before that night.

first, see how bad it is

The single most useful command here is the one that tells you what the journal is actually costing you:

journalctl --disk-usage

On the offending hosts that came back with numbers north of 6GB, which on a 20GB root partition shared with everything else is a lot of log to be keeping around for no particular reason. If you want to know who's generating it, sort by the writing service:

journalctl --output=json --no-pager \
  | jq -r '._SYSTEMD_UNIT' \
  | sort | uniq -c | sort -rn | head

In my case one chatty application was logging a stack trace per request at a few hundred requests a second during an incident, which is its own bug, but the journal happily kept every line because I'd never told it not to.
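If jq isn't installed on the box in trouble, roughly the same tally can be had from journalctl's export format, which writes text fields as plain FIELD=value lines. This is a sketch, not a drop-in replacement: count_units is a helper name I've made up, and it assumes a systemd recent enough to support --output-fields.

```shell
# Count journal entries per unit from journalctl's export format on stdin.
# count_units is a hypothetical helper name, not part of systemd.
count_units() {
  grep '^_SYSTEMD_UNIT=' \
    | cut -d= -f2- \
    | sort | uniq -c | sort -rn \
    | awk '{ print $1, $2 }'
}

# On a real host, limited to the last hour so it finishes quickly:
#   journalctl --since "1 hour ago" --output=export \
#     --output-fields=_SYSTEMD_UNIT --no-pager | count_units | head
```

Restricting the window with --since matters on a bloated journal, where dumping everything can take a long while.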

persistent versus volatile, and why it matters

By default many distributions ship journald in "auto" mode for storage. If /var/log/journal exists, logs are persistent and survive reboots; if it doesn't, they live under /run/log/journal and evaporate on reboot. Persistent is what you want on a server, because logs that vanish when the box restarts are useless precisely when you need them. But persistent has to come with a bound, or you get my bad night.

The settings live in /etc/systemd/journald.conf, or better, a drop-in so package upgrades don't stomp it:

# /etc/systemd/journald.conf.d/limits.conf
[Journal]
Storage=persistent
SystemMaxUse=1G
SystemKeepFree=2G
SystemMaxFileSize=128M
MaxRetentionSec=2week
MaxFileSec=1day

Reading those in order:

  • SystemMaxUse=1G is the headline. It caps the persistent journal under /var/log/journal at a gigabyte per host; once that's reached, the oldest archived files are deleted to get back under it.
  • SystemKeepFree=2G is the safety net for the rest of the disk. Even if SystemMaxUse would allow more, journald backs off to leave at least 2GB free for everyone else. journald takes the more conservative of the two, which is exactly what you want.
  • SystemMaxFileSize caps individual journal files so rotation happens in sensible chunks rather than one enormous file you can't move.
  • MaxRetentionSec and MaxFileSec add a time dimension: nothing older than a fortnight, and start a fresh file every day so retention is granular.
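How SystemMaxUse and SystemKeepFree interact is easier to see as arithmetic. The sketch below is a simplified model of "the more conservative of the two", not journald's actual accounting; effective_cap and its four arguments (all in MiB) are names I've invented for the illustration.

```shell
# effective_cap MAX_USE KEEP_FREE FREE_NOW JOURNAL_NOW   (all in MiB)
# Simplified model: the journal may grow to MAX_USE, unless that would
# leave less than KEEP_FREE free on the filesystem.
effective_cap() {
  max_use=$1; keep_free=$2; free_now=$3; journal_now=$4
  # Space the journal could occupy while still leaving keep_free untouched.
  room=$(( free_now + journal_now - keep_free ))
  if [ "$room" -lt 0 ]; then room=0; fi
  if [ "$room" -lt "$max_use" ]; then echo "$room"; else echo "$max_use"; fi
}

effective_cap 1024 2048 10000 500   # roomy disk: SystemMaxUse is the binding limit
effective_cap 1024 2048 1500 900    # tight disk: SystemKeepFree is the binding limit
```

With the values from the drop-in, the first call prints 1024 and the second prints 352 (1500 + 900 - 2048), which is the case where the safety net matters more than the headline.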

Apply it without a reboot:

systemctl restart systemd-journald
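For a fleet, it helps to have the drop-in and the restart in one repeatable step. A minimal sketch, assuming the same path and values as the drop-in shown earlier; write_journald_limits is a hypothetical helper, and the optional directory argument exists only so it can be tried somewhere harmless first.

```shell
# write_journald_limits [DIR]: write the limits drop-in into DIR.
# Hypothetical helper; DIR defaults to the real drop-in directory.
write_journald_limits() {
  dir=${1:-/etc/systemd/journald.conf.d}
  mkdir -p "$dir" || return 1
  cat > "$dir/limits.conf" <<'EOF'
[Journal]
Storage=persistent
SystemMaxUse=1G
SystemKeepFree=2G
SystemMaxFileSize=128M
MaxRetentionSec=2week
MaxFileSec=1day
EOF
}

# On a real host (as root):
#   write_journald_limits && systemctl restart systemd-journald
```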

reclaiming the space right now

Changing the config bounds future growth, but it won't instantly shrink an already-bloated journal to the new limit on every distro. You can force the issue:

# keep only the last 500M
journalctl --vacuum-size=500M

# or only the last week
journalctl --vacuum-time=7d

--vacuum-size and --vacuum-time delete archived journal files until the target is met. They don't touch the active file, so you can run them on a live system without losing the logs you're currently writing. On the incident night, a --vacuum-size=500M across the fleet bought back enough room to actually log in and think.
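Because vacuuming skips the active files, a journal that is mostly one big active file may shrink less than you hoped. One option is to ask journald to rotate first, so the current files become archived and are eligible for deletion. A sketch, with reclaim_journal and the JOURNALCTL override being my own illustrative names:

```shell
# JOURNALCTL is an illustrative override so the sequence can be dry-run
# (for example JOURNALCTL=echo); it defaults to the real binary.
JOURNALCTL=${JOURNALCTL:-journalctl}

# reclaim_journal SIZE: archive the active files, then vacuum down to SIZE.
reclaim_journal() {
  "$JOURNALCTL" --rotate && "$JOURNALCTL" --vacuum-size="$1"
}

# On a real host (as root):
#   reclaim_journal 500M
```

The vacuum still removes the oldest archived files first, so the most recent 500M of logs survives the rotation.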

the bit people forget: rate limiting

Bounding total size is necessary but it treats the symptom. The cause that night was a service screaming thousands of lines a second, and journald's default rate limiting had been tuned generously enough that it didn't kick in. There are two knobs:

[Journal]
RateLimitIntervalSec=30s
RateLimitBurst=10000

That says: any single service may log up to 10,000 messages per 30-second window, and beyond that journald drops them and notes how many it suppressed. It's a per-service limit, so one misbehaving unit can't drown out the logs you actually need from everything else. Current upstream systemd documents these same two values as its defaults, so treat them as the starting point: tune the burst to your busiest legitimate service and no higher. The suppression notice in the journal is itself a useful signal that something's gone shouty.
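To see who has been tripping the limit, the suppression notices can be totalled per source. This is a sketch that assumes the notice reads "Suppressed N messages from SOURCE", which is the wording I've seen; check it against your own journal before relying on it. suppressed_by_source is a made-up helper name.

```shell
# Sum "Suppressed N messages from SOURCE" notices per source, read from stdin.
# Hypothetical helper; the message wording is an assumption about journald's output.
suppressed_by_source() {
  awk '{
    for (i = 1; i <= NF - 4; i++)
      if ($i == "Suppressed" && $(i+2) == "messages" && $(i+3) == "from")
        total[$(i+4)] += $(i+1)
  } END {
    for (src in total) print total[src], src
  }' | sort -rn
}

# On a real host:
#   journalctl -u systemd-journald --no-pager | suppressed_by_source
```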

checking your work

After all this, verify rather than assume. journalctl --disk-usage should now sit comfortably under your cap. A quick way to confirm the config actually took:

systemd-analyze cat-config systemd/journald.conf

That prints the main journald.conf followed by every drop-in, in the order they're applied, so you can see that your SystemMaxUse is the last one set and nothing in /usr/lib is quietly overriding it.
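If you'd rather have the disk-usage check run unattended, measuring the directory is simpler than parsing journalctl's human-readable output. A minimal sketch, assuming the persistent journal lives in /var/log/journal; journal_within_cap is a hypothetical helper and the cap is given in MiB.

```shell
# journal_within_cap DIR CAP_MIB: succeed if DIR uses at most CAP_MIB MiB.
# Returns 2 if DIR does not exist. Hypothetical helper name.
journal_within_cap() {
  dir=$1; cap_mib=$2
  [ -d "$dir" ] || return 2
  used_kib=$(du -sk "$dir" | awk '{ print $1 }')
  [ "$used_kib" -le $(( cap_mib * 1024 )) ]
}

# From cron or a monitoring check:
#   journal_within_cap /var/log/journal 1024 || echo "journal over cap" >&2
```

du and journalctl --disk-usage won't agree to the byte, so leave a little headroom in whatever threshold you alert on.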

None of this is clever. That's rather the point. journald is a genuinely good piece of engineering (fast to query, structured, indexed), and the failure mode I hit was entirely because I'd left it on defaults built to suit a desktop rather than a server with a small root disk and a service that occasionally loses its mind. Ten lines of drop-in config and a vacuum, and it's been quiet ever since. The logs are still there when I want them, which after that night feels like the whole job.