a hundred lines of go that finally left my laptop

I have written dozens of little tools that never left my laptop. This one I actually shipped: a small daemon that polls a handful of internal endpoints, checks they return what they should, and exposes the results as Prometheus metrics so Grafana can draw them. About a hundred and fifty lines of Go, and it is now running on the box it watches. The shipping is the part I usually skip, so this post is mostly about the boring bits that make something deployable rather than the polling, which is trivial.

why go for this

Because the output is a single static binary with no runtime to install. I scp one file to the target, drop a systemd unit next to it, and it runs. No virtualenv, no interpreter version to match, no shared library surprises. For a long-lived background process on a server I do not log into often, that property is worth more than any language feature.

The whole thing is the standard library plus the Prometheus client. No web framework, because net/http is already a perfectly good web server.

the shape of it

The core is a loop on a ticker, with a context so it can be told to stop:

func (c *Checker) Run(ctx context.Context) {
	ticker := time.NewTicker(c.interval)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			c.checkAll()
		}
	}
}

That select on ctx.Done() is the bit that took me embarrassingly long to make a habit. Without it, the process ignores SIGTERM and systemd ends up killing it with SIGKILL after its stop timeout (90 seconds by default). With it, the daemon notices the signal, the loop returns, in-flight checks finish, and it exits cleanly. The difference between a tidy shutdown and an abrupt one is one channel.

wiring up the signals

The main function is mostly plumbing, and the plumbing is what makes it a daemon rather than a script:

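func main() {
	// ctx is cancelled when SIGINT or SIGTERM arrives.
	ctx, stop := signal.NotifyContext(context.Background(),
		syscall.SIGINT, syscall.SIGTERM)
	defer stop()

	checker := NewChecker(loadConfig())
	done := make(chan struct{}) // closed once the checker loop has returned
	go func() {
		defer close(done)
		checker.Run(ctx)
	}()

	// A nil Handler means srv serves http.DefaultServeMux.
	srv := &http.Server{Addr: ":9100"}
	http.Handle("/metrics", promhttp.Handler())

	go func() {
		if err := srv.ListenAndServe(); err != nil &&
			err != http.ErrServerClosed {
			log.Fatal(err)
		}
	}()

	<-ctx.Done()
	shutdownCtx, cancel := context.WithTimeout(
		context.Background(), 5*time.Second)
	defer cancel()
	if err := srv.Shutdown(shutdownCtx); err != nil {
		log.Printf("metrics server shutdown: %v", err)
	}
	<-done // let an in-flight check finish before main returns
}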

signal.NotifyContext only landed in Go 1.16 a couple of months ago, and it tidies this up nicely: the context cancels itself when a signal arrives, so you do not need the old dance of making your own channel, passing it to signal.Notify, and cancelling a context by hand. Small thing, but it is exactly the kind of paper cut the standard library quietly removes every release.

the metrics

Each check updates a gauge labelled by name, so a dashboard can show all of them at once:

var up = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "endpoint_up",
		Help: "1 if the endpoint responded as expected",
	},
	[]string{"name"},
)

Set it to 1 on success and 0 on failure, register it once at startup, and Prometheus scrapes /metrics on its own schedule. The daemon does not push anything anywhere, which means there is no queue to back up and nothing to fall over if Prometheus is briefly down. It just keeps the numbers current and waits to be asked.

shipping it

The deploy is genuinely three steps. Cross-compile, copy, enable:

GOOS=linux GOARCH=amd64 go build -o checkd .
scp checkd box:/usr/local/bin/
ssh box 'systemctl enable --now checkd'

The systemd unit is a dozen lines, with Restart=on-failure so it comes back if it dies and Type=simple because the binary does not fork. That is the entire operational footprint.
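The unit file itself is not reproduced here, so the following is only a sketch of what a unit along those lines could look like, saved as /etc/systemd/system/checkd.service. Only Type=simple, Restart=on-failure, and the binary path come from the text above; the description, ordering, and timing values are assumptions.

```ini
[Unit]
# Description and ordering are assumptions, not taken from the post.
Description=Endpoint checker exporting Prometheus metrics
After=network-online.target
Wants=network-online.target

[Service]
# The binary stays in the foreground and does not fork.
Type=simple
ExecStart=/usr/local/bin/checkd
# Bring it back if it exits with an error or is killed by a signal.
Restart=on-failure
RestartSec=5
# Leave room for the 5 second HTTP shutdown before systemd sends SIGKILL.
TimeoutStopSec=10

[Install]
WantedBy=multi-user.target
```

systemd stops a service with SIGTERM by default, which is the signal the daemon already listens for.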

What I take from this is not anything clever about Go. It is that "ship it" was always one systemd unit and a graceful shutdown away, and I had been treating that gap as larger than it is. The polling logic was twenty minutes. Making it behave like a citizen of the machine (respecting signals, exposing its state, restarting cleanly) was the rest, and it is the part that turns a script into something I trust to run unattended.