The disk filled up on a box that, according to du, was using barely half of it. That contradiction is almost always the same story: a process is holding an open file descriptor on a log file that has already been deleted, so df still counts blocks that du can no longer find by name. The space won't come back until the process lets go.
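To make that concrete, here is a minimal sketch of the mechanism on a POSIX system. The scratch file is a stand-in created with tempfile, not the vendor's log, and the 4096-byte payload is arbitrary.

```python
import os
import tempfile

# Create a scratch file standing in for a log (hypothetical, not the real path).
fd, path = tempfile.mkstemp(suffix=".log")
os.write(fd, b"x" * 4096)

# Remove the directory entry while the descriptor is still open.
os.unlink(path)

info = os.fstat(fd)
name_gone = not os.path.exists(path)  # du walks names, so it cannot see this file
link_count = info.st_nlink            # normally 0: no directory entry points at the inode
size_held = info.st_size              # the data is still there behind the descriptor

print(name_gone, link_count, size_held)

# Only closing the last open descriptor lets the filesystem free the blocks.
os.close(fd)
```

The file has no name, yet fstat on the open descriptor still reports its full size. That is the state the vendor service was keeping alive.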
lsof | grep deleted confirmed it. A vendor service was writing happily to /var/log/thing/app.log (deleted), several gigabytes of it, all invisible to du because the directory entry was gone. logrotate had done its job: it renamed the old log and signalled the service to reopen. The service had simply ignored the signal.
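If lsof isn't installed, the same information is available on Linux under /proc, where each descriptor is a symlink whose target ends in " (deleted)" once the file has been unlinked. The sketch below assumes a Linux /proc layout; the function name deleted_open_files is made up for illustration, and inspecting another user's process generally needs root.

```python
import os


def deleted_open_files(pid):
    """Return (fd, target) pairs for descriptors of `pid` whose file was unlinked."""
    fd_dir = f"/proc/{pid}/fd"  # Linux-specific location
    found = []
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the descriptor was closed while we were looking
        if target.endswith(" (deleted)"):
            found.append((int(name), target))
    return found
```

Calling deleted_open_files(os.getpid()) checks the current process; pass the service's pid to check the suspect.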
The usual logrotate dance relies on a postrotate hook telling the app to reopen its files, most often with SIGHUP:
/var/log/thing/*.log {
    daily
    rotate 14
    compress
    postrotate
        /bin/kill -HUP $(cat /var/run/thing.pid)
    endscript
}
This works beautifully right up until the app doesn't handle SIGHUP. Plenty don't. For some daemons SIGHUP means "reload config", for some it means "reopen logs", and a depressing number never install a handler at all, so the default action applies and the process terminates. This one fell into the worst category: it caught SIGHUP and did nothing useful with it. The signal arrived, the handler shrugged, and the descriptor stayed pointed at the deleted inode.
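For contrast, this is roughly what a cooperating daemon does. It is a sketch, not the vendor's code: the log path, the write_line helper and the flag-based handler are all assumptions made for illustration, and the rename and kill calls stand in for logrotate and its postrotate hook.

```python
import os
import signal
import tempfile

LOG_PATH = os.path.join(tempfile.mkdtemp(), "app.log")  # hypothetical path
log = open(LOG_PATH, "a")
reopen_requested = False


def on_sighup(signum, frame):
    # Keep the handler tiny: just record that a reopen was requested.
    global reopen_requested
    reopen_requested = True


signal.signal(signal.SIGHUP, on_sighup)


def write_line(text):
    """Write one line, reopening the log first if SIGHUP asked for it."""
    global log, reopen_requested
    if reopen_requested:
        log.close()                # let go of the rotated file
        log = open(LOG_PATH, "a")  # creates a fresh file under the original name
        reopen_requested = False
    log.write(text + "\n")
    log.flush()


write_line("before rotation")
os.rename(LOG_PATH, LOG_PATH + ".1")  # what logrotate does to the old log
os.kill(os.getpid(), signal.SIGHUP)   # what the postrotate hook does
write_line("after rotation")
```

After this runs, the first line lives in app.log.1 and the second in a new app.log. A handler that catches the signal but never closes and reopens the file leaves every later write going to the old inode.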
You can't make an app respect a signal it has decided to ignore. So the answer is to stop asking it to. logrotate has copytruncate for exactly this case:
/var/log/thing/*.log {
    daily
    rotate 14
    compress
    copytruncate
}
copytruncate copies the live log to the rotated filename and then truncates the original in place to zero bytes. The app never has to reopen anything, because the file descriptor it holds is still valid, still pointing at the same inode, which is now empty. No signal, no cooperation required.
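The mechanics are easy to reproduce outside logrotate. The following sketch imitates the two steps with shutil.copyfile and os.truncate; the file names are hypothetical, and the "app" is assumed to open its log in append mode.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
live = os.path.join(workdir, "app.log")       # hypothetical names
rotated = os.path.join(workdir, "app.log.1")

# The "app": opens its log once in append mode and never reopens it.
app_fd = os.open(live, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
os.write(app_fd, b"old line\n")
inode_before = os.fstat(app_fd).st_ino

# The "rotator": copy the contents aside, then truncate the original in place.
shutil.copyfile(live, rotated)
os.truncate(live, 0)

# The app keeps writing through the same descriptor, unaware of the rotation.
os.write(app_fd, b"new line\n")
inode_after = os.fstat(app_fd).st_ino
os.close(app_fd)

with open(rotated) as f:
    rotated_text = f.read()
with open(live) as f:
    live_text = f.read()
```

The inode is the same before and after, the old line ends up in app.log.1, and the new line starts the emptied app.log. The append-mode assumption matters: a writer that opened the file without O_APPEND keeps its old offset after the truncate, and its next write leaves a sparse hole at the start of the file.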
It is not free, and it is worth being honest about the trade. There is a small window between the copy and the truncate in which anything the app writes lands in neither file: it arrives too late for the copy and is then wiped by the truncate. Under heavy logging you can lose a handful of lines per rotation. For an application log that nobody audits to the line, that is a fine price. For anything where every line matters (billing, security audit), you want the app to behave properly and reopen on signal, and if it won't, that's a bug to chase upstream rather than paper over.
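The window can be forced deterministically by interleaving one write between the two steps. This is an illustrative sketch with hypothetical file names; in real life the interleaving depends on timing and only happens now and then.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
live = os.path.join(workdir, "app.log")       # hypothetical names
rotated = os.path.join(workdir, "app.log.1")

app_fd = os.open(live, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
os.write(app_fd, b"line 1\n")

shutil.copyfile(live, rotated)  # step 1: the copy captures line 1 only
os.write(app_fd, b"line 2\n")   # the app logs inside the window
os.truncate(live, 0)            # step 2: the truncate discards line 2
os.write(app_fd, b"line 3\n")
os.close(app_fd)

with open(rotated) as f:
    kept_in_rotated = f.read()
with open(live) as f:
    kept_in_live = f.read()

# Line 2 was written successfully but survives in neither file.
lost = "line 2" not in kept_in_rotated + kept_in_live
```

Here the rotated file holds line 1, the live file holds line 3, and line 2 is gone without any error being reported to the writer.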
For this vendor box, a few lost lines once a day was nothing against a disk that no longer filled silently. I restarted the service once to release the orphaned descriptor and reclaim the space, switched the config to copytruncate, and moved on. The real lesson is older than logrotate: a deleted file with an open writer is not gone, it's just hidden, and lsof will always tell you who's still holding the door.