Ramblings of an aging IT geek
golang

the smallest daemon i could get away with

Writing a small single-binary Go daemon that polls a few endpoints and exposes health, signals and a clean shutdown, without dragging in a framework.

I needed a daemon. Not a service mesh, not a controller, not anything that wants a Helm chart. Just a long-running process that wakes up every thirty seconds, checks a handful of internal endpoints, and writes the result somewhere I can scrape it. The sort of thing that used to be a cron job and a shell script, until the shell script grew three flags and a state file and started lying to me about whether it was still running.

So I wrote it in Go, and the whole point of this post is how little I had to write. One binary, no dependencies beyond the standard library, and it does the job.

The shape of it

The bones of a daemon like this are always the same: a context that gets cancelled on signal, a ticker, and a worker that respects the context. Everything else is detail.

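func main() {
	// ctx is cancelled when SIGINT or SIGTERM arrives.
	ctx, stop := signal.NotifyContext(context.Background(),
		syscall.SIGINT, syscall.SIGTERM)
	defer stop() // unregister the signal handler on the way out

	t := time.NewTicker(30 * time.Second)
	defer t.Stop()

	// Check once up front rather than waiting 30 seconds for the first tick.
	if err := checkOnce(ctx); err != nil {
		log.Printf("initial check failed: %v", err)
	}

	for {
		select {
		case <-ctx.Done():
			log.Println("shutting down")
			return
		case <-t.C:
			// A failed check is logged, not fatal; the next tick tries again.
			if err := checkOnce(ctx); err != nil {
				log.Printf("check failed: %v", err)
			}
		}
	}
}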

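signal.NotifyContext is the bit that made this pleasant. Before it existed you wired up a channel, caught the signal, cancelled a context yourself, and inevitably got the ordering subtly wrong on the second signal. Now Ctrl-C does the obvious thing. One caveat: further signals are swallowed until stop is called, and only then is the default behaviour restored. In the loop above that makes no difference, because main returns straight away. If shutdown does any real work, though, call stop() as soon as ctx.Done() fires, so that a second Ctrl-C kills the process outright, which is exactly what you want when the first one didn't take.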

A simple block diagram of the daemon loop

Health, because systemd asks

I run this under systemd, so it gets a tiny HTTP server on a loopback port with /healthz returning the timestamp of the last successful check. Nothing clever. A handler, a mutex around the last-good time, and http.Server with a sane ReadHeaderTimeout so it isn't a slowloris target even on localhost.

srv := &http.Server{
	Addr:              "127.0.0.1:9101",
	ReadHeaderTimeout: 5 * time.Second,
}

The thing I keep relearning: give every outbound request a timeout via the context, and give the server its timeouts explicitly. The zero values are not your friends. A daemon that hangs forever on a single dead endpoint is worse than no daemon, because at least no daemon tells you the truth.

The build is CGO_ENABLED=0 go build, the result is about six megabytes, and it copies to the box with scp. No runtime to install, no virtualenv to rot. It has been up for a week now and I have thought about it precisely zero times since, which is the highest praise I have for any piece of software I wrote myself.