The pager went off at twenty past one with "service flapping" and not much else. SSH took an age to let me in, and when it finally did the shell greeted me with -bash: cannot create temp file for here-document: No space left on device. That narrows things down nicely.
df -h told the whole story: /var at 100%. Not the root partition, just /var, which on this box is its own mount. Half the daemons that wanted to write a pidfile or a socket under /var/run had simply given up, and the ones that were still alive couldn't log, so they looked dead to anything watching their logs.
The culprit was a debug log left on after a deploy three weeks earlier. One service had been writing at a few megabytes a second into /var/log with no rotation, because the logrotate stanza had a typo in the path and had been silently doing nothing the entire time. du -sh /var/log/* | sort -h put the offender at the bottom of the list at 38G. A quick truncate -s 0 on the file (not rm: the process still had the file open, so unlinking it would have freed nothing until the process closed it or restarted) gave me the headroom back instantly, and the services recovered on their own within a minute.
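If the truncate-versus-rm distinction feels abstract, here is a small sketch that reproduces it in a scratch file. File descriptor 3 stands in for the daemon's open log handle; the temp file and variable names are made up for the demo, and it assumes GNU coreutils truncate. It also assumes the writer opened the file in append mode, as most loggers do. Without append mode the writer keeps its old offset and the file comes back as a sparse file of the old apparent size.

```shell
#!/bin/sh
# Demo only: a scratch file plays the runaway log, fd 3 plays the daemon.
log=$(mktemp)

exec 3>>"$log"                      # "daemon" opens the log in append mode
printf 'debug line\n' >&3           # ...and writes to it

truncate -s 0 "$log"                # free the space without touching the handle
size_after_truncate=$(($(wc -c < "$log")))

printf 'next line\n' >&3            # the still-open handle keeps working
size_after_write=$(($(wc -c < "$log")))

echo "after truncate: $size_after_truncate bytes"
echo "after next write: $size_after_write bytes"

exec 3>&-                           # close the handle and clean up
rm -f "$log"
```

The file drops to zero bytes straight away and the writer carries on appending from the new end, which is why the services did not need a restart.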
The fix was boring: turn the debug logging off, fix the logrotate path, and add a check that actually alerts on partition usage rather than waiting for things to fall over. The lesson, which I keep relearning, is that "the disk is fine" usually means "the root disk is fine" and /var is off having its own quiet crisis. Worth reading the df line for every mount, not just the one you remember.
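For the partition check, something along these lines would do as a starting point. It is a minimal sketch, not the check I actually deployed: the over_threshold function name and the 85% default are illustrative, and it assumes mount points without spaces in them, since it splits df -P output on whitespace.

```shell
#!/bin/sh
# Sketch: list every mounted filesystem at or above a usage threshold.
# THRESHOLD and over_threshold are illustrative names, not from any tool.
THRESHOLD="${THRESHOLD:-85}"

# Reads `df -P` output on stdin; prints "mountpoint percent" for each
# filesystem whose Capacity column is at or above the limit given as $1.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)            # "100%" -> "100"
        if (pct + 0 >= limit) print $6, pct
    }'
}

# -P forces the POSIX format: one line per filesystem, fixed columns.
df -P | over_threshold "$THRESHOLD"
```

Run from cron or a monitoring agent, any output means a mount needs attention, and /var gets the same scrutiny as /.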