I ran systemctl stop on a service and it stopped. Then, a second later, it was running again. Stop, running. Stop, running. For about a minute I genuinely wondered if I'd lost my mind, which is roughly the standard emotional arc of any systemd debugging session.
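The unit wasn't haunted. It had Restart=always, but that alone doesn't explain it: systemd does not apply Restart= to a unit you stopped on purpose with systemctl stop. What it cannot do is tell anything outside itself that "the operator stopped me" is different from "I crashed", and that matters once a stale orchestrator is involved. In my case a separate watchdog timer was noticing the gap and dutifully starting it back up. systemd was doing exactly what it was told. I just hadn't told it the whole truth.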
The first thing that cut through the confusion was checking what systemd actually thought had happened, rather than what I assumed:
systemctl status myservice
journalctl -u myservice --since "5 min ago"
The journal showed the service exiting non-zero almost immediately after every start, well before I'd touched it. It wasn't a unit that wouldn't stay dead. It was a unit that wouldn't stay alive, crash-looping fast enough that the restarts blurred into one continuous "running" in my head.
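One way to put a number on a loop like that is to count exit messages in the journal instead of eyeballing them. This is a minimal sketch, not what I ran at the time: count_exits is a made-up helper name, and it assumes the journal line contains the phrase "Main process exited", which is how the systemd versions I have seen log it. The wording may differ on yours.

```shell
# Hypothetical helper: count how many times the main process exited
# in a chunk of journal text read from stdin.
# Assumes the message contains "Main process exited"; adjust the
# pattern if your systemd words it differently.
count_exits() {
    # grep -c prints 0 and returns non-zero when nothing matches,
    # so swallow the exit status to keep the count usable in scripts.
    grep -c 'Main process exited' || true
}

# Usage (on a machine with systemd):
#   journalctl -u myservice --since "5 min ago" | count_exits
# Recent systemd versions also keep their own counter:
#   systemctl show myservice -p NRestarts
```

A count well above the number of times you started the unit by hand points to a crash loop, not a unit that refuses to stop.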
Two things fixed it. First, the actual bug: a missing environment file meant the process bailed on startup, and Restart=always turned a clean, visible crash into an invisible flapping loop. The restart policy wasn't resilience, it was a blindfold. Second, for the debugging itself, systemctl stop followed immediately by masking gives you a moment of peace to think:
systemctl mask myservice
A masked unit is symlinked to /dev/null and cannot be started at all until you run systemctl unmask: not by you, not as a dependency of another unit, and not by an over-eager watchdog. Once I could keep it down, the journal told the real story in about thirty seconds.
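If you want to confirm that a mask actually took, the symlink is easy to check by hand. The sketch below does this under two assumptions: is_masked_path is a hypothetical helper name, and the mask link sits at /etc/systemd/system/myservice.service, which is where systemctl mask normally puts it.

```shell
# Hypothetical helper: succeed if the given unit file path is a
# symlink that points at /dev/null, which is what masking creates.
is_masked_path() {
    [ -L "$1" ] && [ "$(readlink "$1")" = "/dev/null" ]
}

# Usage (the path is an assumption; check where your unit lives):
#   is_masked_path /etc/systemd/system/myservice.service && echo masked
```

On the systemd side, systemctl mask --now myservice stops and masks in one step, systemctl is-enabled myservice reports "masked" afterwards, and systemctl unmask myservice undoes it when you are finished.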
The lesson I keep relearning: Restart=always is a fine policy for production and a terrible one while you're diagnosing, because it hides the very evidence you need. When a service "won't die", check whether it's actually dying over and over. Then read the journal before you read your own assumptions.
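If masking is too blunt and you still want to start the service by hand while you poke at it, a temporary drop-in that turns restarts off is another option. This is a sketch under assumptions: write_no_restart_dropin and the file name 90-debug-no-restart.conf are made up for illustration, and it relies on systemd reading drop-ins from /run/systemd/system, which is cleared on reboot. Another drop-in that sorts later by file name could still override it.

```shell
# Hypothetical helper: write a drop-in that disables automatic
# restarts for one unit.
#   $1 = full unit name, e.g. myservice.service
#   $2 = base directory (defaults to the runtime drop-in location)
write_no_restart_dropin() {
    unit=$1
    base=${2:-/run/systemd/system}
    mkdir -p "$base/$unit.d" || return 1
    # Overrides Restart= from the main unit file while this file exists.
    printf '[Service]\nRestart=no\n' > "$base/$unit.d/90-debug-no-restart.conf"
}

# Usage (as root):
#   write_no_restart_dropin myservice.service && systemctl daemon-reload
# To restore the original policy, delete the file and run
# systemctl daemon-reload again.
```

With restarts off, a crash stays a crash: the unit lands in a failed state and systemctl status shows the exit code instead of a fresh "running".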