Ramblings of an aging IT geek
linux

The service I could not convince systemd to stop restarting

A daemon kept respawning after every stop command, and the culprit was a Restart policy fighting against my idea of what stopping meant.


I wanted to stop a service. Just for a few minutes, to do some maintenance underneath it. systemctl stop foo returned cleanly, I checked, and a second later it was running again with a fresh PID. Stop, running. Stop, running. It felt less like administering a server and more like an argument with something that wasn't listening.

What was actually happening

Two things were conspiring, and neither was a bug.

The first was the unit's own restart policy. It had Restart=always in the [Service] section, which does exactly what it says: if the main process goes away for any reason, bring it back. The thing is, a clean systemctl stop should suppress that, and it does, so this wasn't the whole story.

[Service]
Restart=always
RestartSec=2
ExecStart=/usr/local/bin/foo

The second thing was the real culprit. There was a .path unit watching a directory, and a .timer, both of which had foo.service as their target. Stopping the service didn't stop the things that started it. The moment a file landed in the watched directory, the path unit dutifully fired the service straight back up, exactly as designed. I was stopping the dog and forgetting about the lead it was tied to.
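One way to see that lead from the service's side is to ask systemd which units are allowed to activate it. The sketch below wraps the query in a small helper. The function name triggers_of is made up for illustration, foo.service is the placeholder used in this post, and it assumes a systemd recent enough to expose the TriggeredBy property and the --value flag.

```shell
#!/bin/sh
# Hypothetical helper: print one line per unit that can trigger the given unit.
triggers_of() {
    unit="$1"
    # TriggeredBy is a space-separated list such as "foo.path foo.timer";
    # --value prints the list without the "TriggeredBy=" prefix.
    # tr puts each unit on its own line, sed drops the empty line you get
    # when nothing triggers the unit.
    systemctl show -p TriggeredBy --value "$unit" | tr ' ' '\n' | sed '/^$/d'
}

# Usage (not run here):
#   triggers_of foo.service
```

If that prints anything at all, stopping the service alone is only a temporary measure: each unit listed can bring it back.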


The fix

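What I should have done first was ask systemd what could start the unit. On recent systemd versions, systemctl status foo.service lists the units that activate a service on its TriggeredBy: line (systemctl show -p TriggeredBy foo.service prints the same list), and there sat the path and timer units I'd forgotten existed. systemctl list-dependencies --reverse foo.service is worth running as well, but it follows Wants=- and Requires=-style dependencies rather than triggers, so on its own it can miss exactly this kind of unit. To take the service down properly for maintenance I masked it, adding --now so that it is also stopped if something has just started it again:

systemctl mask --now foo.service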

Masking links the unit to /dev/null, so nothing can start it: not me, not the path unit, not the timer, not a dependency. It refuses to start until you unmask it. That is the right tool for "I genuinely want this off and I mean it", as opposed to stop, which only means "off until something asks for it again".
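For a short maintenance window, the mask and unmask steps can be paired so the unmask is never forgotten. This is a minimal sketch, not a hardened script: with_unit_masked is a hypothetical helper name, and the maintenance command in the usage line is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: keep a unit masked while a maintenance command runs.
# Usage: with_unit_masked UNIT COMMAND [ARGS...]
with_unit_masked() {
    masked_unit="$1"
    shift
    # --now stops the unit as well as masking it; mask on its own
    # leaves an already-running instance alone.
    systemctl mask --now "$masked_unit" || return 1
    # Run the maintenance command and remember its exit status.
    "$@"
    cmd_status=$?
    # Remove the mask whether or not the command succeeded. This does not
    # start the unit; the path and timer units are free to do that again.
    systemctl unmask "$masked_unit"
    return "$cmd_status"
}

# Usage (not run here; the maintenance script path is a placeholder):
#   with_unit_masked foo.service /usr/local/bin/do-maintenance
```

The helper returns the maintenance command's exit status, so a failed run is still visible to whatever called it.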

The lesson is that in systemd a service is rarely an island. Before you conclude that a unit is haunted, ask what else has an opinion about whether it should be running, because something usually does, and it's usually more stubborn than you are.