Ramblings of an aging IT geek

the log that kept growing after rotation

A disk filled up because logrotate rotated the file but the app held the old inode open, ignoring the SIGHUP that was meant to make it reopen.

Disk filled up on a box that was supposedly rotating its logs nightly. The rotated files were all there, neatly dated and gzipped, so logrotate was clearly running. And yet df kept climbing. The trick, which I'd seen before and still walked into, is that logrotate had renamed the file but the application was still writing to the old one.

On Linux a file is its inode, not its name. When logrotate moves app.log to app.log.1, a process that already has the file open keeps writing to the same inode under its new name. Once compression runs, gzip writes app.log.1.gz and unlinks the original, and the app is left writing to a file with no name at all. The space that deleted file occupies isn't freed until the last open handle closes, which is why df says full whilst du on the directory looks fine. lsof | grep deleted is the tell.
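If you want to convince yourself of the inode behaviour without filling a disk, here is a minimal Python sketch. It plays both parts in a temporary directory: a writer standing in for the app, and a rename followed by an unlink standing in for logrotate and its compression step. The function name and file names are mine, purely for illustration.

```python
import os
import tempfile


def follow_inode_through_rotation():
    """Rename, then unlink, a log file while a writer still holds it open."""
    with tempfile.TemporaryDirectory() as d:
        live = os.path.join(d, "app.log")
        rotated = os.path.join(d, "app.log.1")

        writer = open(live, "a")                 # stands in for the application
        inode_before = os.fstat(writer.fileno()).st_ino

        os.rename(live, rotated)                 # stands in for logrotate's move
        writer.write("written after rename\n")
        writer.flush()
        inode_after_rename = os.fstat(writer.fileno()).st_ino
        landed_in_rotated = os.path.getsize(rotated) > 0
        live_recreated = os.path.exists(live)

        os.unlink(rotated)                       # stands in for gzip removing the original
        writer.write("written after unlink\n")
        writer.flush()
        held = os.fstat(writer.fileno())         # the handle still works
        writer.close()                           # only now is the space released

        return {
            "same_inode": inode_before == inode_after_rename == held.st_ino,
            "landed_in_rotated": landed_in_rotated,
            "live_recreated": live_recreated,
            "links_after_unlink": held.st_nlink,
            "bytes_still_held": held.st_size,
        }
```

The handle reports the same inode at every step, the write after the rename lands in app.log.1, and nothing recreates app.log. After the unlink the file has no directory entry, yet it still has a size, and that size is the part df counts and du cannot see.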

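The postrotate script was supposed to fix exactly this by sending the app a SIGHUP to make it reopen its log file. The app, it turned out, had no reopen logic behind SIGHUP at all. The signal arrived and was ignored (explicitly, one assumes, since an unhandled SIGHUP terminates the process by default), so it carried on writing into the now-renamed, soon-to-be-gzipped file. Once gzip had written its compressed copy and removed the original, the app was writing into an inode with no name left at all, which is as broken as it sounds.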
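For contrast, this is roughly what a cooperative app does with that signal. It is a sketch in Python under my own assumptions, not the code of the app in question: ReopeningLog, install_sighup_reopen and demo are hypothetical names, and the demo calls the handler directly instead of delivering a real SIGHUP so that it behaves the same wherever it is run.

```python
import os
import signal
import tempfile


class ReopeningLog:
    """Append-only log writer that reopens its path after a SIGHUP."""

    def __init__(self, path):
        self.path = path
        self._fh = open(path, "a")
        self._reopen_requested = False

    def request_reopen(self, signum=None, frame=None):
        # Keep the handler trivial: set a flag and do the real work later.
        self._reopen_requested = True

    def write(self, line):
        if self._reopen_requested:
            self._fh.close()
            # Opening by name again creates (or finds) the fresh file.
            self._fh = open(self.path, "a")
            self._reopen_requested = False
        self._fh.write(line + "\n")
        self._fh.flush()

    def writes_to_current_path(self):
        # True only if the open handle and the path name the same inode.
        try:
            on_disk = os.stat(self.path)
        except FileNotFoundError:
            return False
        held = os.fstat(self._fh.fileno())
        return (held.st_dev, held.st_ino) == (on_disk.st_dev, on_disk.st_ino)

    def close(self):
        self._fh.close()


def install_sighup_reopen(log):
    # signal.signal must be called from the main thread; SIGHUP is POSIX-only.
    signal.signal(signal.SIGHUP, log.request_reopen)


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "app.log")
        log = ReopeningLog(path)
        log.write("before rotation")
        os.rename(path, path + ".1")             # logrotate's move
        stale = not log.writes_to_current_path()
        log.request_reopen(signal.SIGHUP, None)  # stands in for the real signal
        log.write("after rotation")
        fresh = log.writes_to_current_path()
        log.close()
        with open(path + ".1") as f:
            old = f.read()
        with open(path) as f:
            new = f.read()
        return stale, fresh, old, new
```

The writes_to_current_path check is the useful bit when you are confirming rather than assuming: compare the inode behind the open handle with the inode behind the name, and if they differ the reopen did not happen. For Python apps specifically, the standard library's logging.handlers.WatchedFileHandler does a similar inode comparison on each emit and reopens by itself, so no signal is needed.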

Two ways out, depending on the app. If it can be made to reopen on a signal, wire that into postrotate and confirm it actually does something rather than assuming. If it can't, use copytruncate, which copies the file aside and truncates the original in place so the inode stays the same and the handle stays valid. It costs you a small window where lines can be lost between copy and truncate, but for an app that won't cooperate, a tiny gap beats a full disk. I went with copytruncate and the graph finally came back down.
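To see why copytruncate keeps an uncooperative writer happy, here is a small Python simulation of the idea. It is not logrotate's implementation, just the same two steps (copy aside, truncate in place) under stated assumptions: the function names are mine, and the writer is assumed to have opened its log in append mode.

```python
import os
import shutil
import tempfile


def copytruncate(path, rotated_path):
    """Copy the log aside, then empty the original in place."""
    shutil.copy2(path, rotated_path)
    # Anything the app writes between the copy above and the truncate below
    # is lost; that is the small window this approach costs.
    os.truncate(path, 0)


def demo_copytruncate():
    with tempfile.TemporaryDirectory() as d:
        live = os.path.join(d, "app.log")
        rotated = live + ".1"

        # Assumption: append mode, so each write lands at the current end of
        # the file and the writer carries on from offset 0 after the truncate.
        writer = open(live, "a")
        writer.write("old line\n")
        writer.flush()
        inode_before = os.stat(live).st_ino

        copytruncate(live, rotated)

        writer.write("new line\n")               # same handle, never reopened
        writer.flush()
        same_inode = (
            inode_before
            == os.stat(live).st_ino
            == os.fstat(writer.fileno()).st_ino
        )
        writer.close()

        with open(rotated) as f:
            rotated_text = f.read()
        with open(live) as f:
            live_text = f.read()
        return {"same_inode": same_inode, "rotated": rotated_text, "live": live_text}
```

The old line ends up in app.log.1, the new line ends up in app.log, and the inode never changes, so the app's handle stays valid without it knowing anything happened. The append-mode assumption matters: a writer that tracks its own offset would keep writing at the old position after the truncate and leave a sparse hole at the front of the file.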