Ramblings of an aging IT geek
golang

making my peace with context.Context

context.Context felt like noise in every function signature until a hung backend that wouldn't cancel taught me what it's actually for, and where the genuine footguns hide.

For a good while context.Context struck me as bureaucratic. Every function grew a first parameter that did nothing useful most of the time, you passed it down, passed it down again, and the only thing it seemed to buy you was a slightly longer signature and the smug satisfaction of having done the Go thing. I threaded it through because the linter and every code review told me to, not because I understood what I was paying for.

The understanding arrived, as these things do, via an outage.

The hang that wouldn't let go

We had a service that called a downstream API on behalf of an inbound request. The downstream went unhealthy: not down, which is easy, but slow, hanging on the connection without ever responding or closing. Our service kept its goroutines blocked waiting on those calls. The inbound clients had long since given up and walked away, but our side didn't know that, so it sat there holding connections, holding goroutines, holding memory, until it ran out of all three and fell over.

The fix wasn't more retries or a circuit breaker, though both came later. The real problem was that nobody had wired a deadline through to the call that was hanging. The HTTP client had no per-request timeout, the database query had no cancellation, and, crucially, when the inbound request was abandoned, nothing told the work it had spawned to stop. There was no thread to pull. That thread is exactly what context.Context is.

What it's actually for

Two things, really, and once they clicked the parameter stopped feeling like noise.

The first is cancellation. When the reason for doing work disappears, the work should stop. A net/http handler gets a context, via r.Context(), that is cancelled when the client disconnects or the handler returns. If you thread that same context down into your database call, your outbound HTTP call, your queue publish, then when the client gives up, every bit of in-flight work on their behalf gets the signal to abandon ship. You stop computing answers nobody is waiting for.

The second is deadlines, which are just cancellation with a clock attached. context.WithTimeout gives you a context that cancels itself after a duration, and any well-behaved call that takes a context will honour it.

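// Fetch gives the whole round trip at most two seconds, or less if the
// caller's ctx is cancelled or expires sooner.
func (s *Service) Fetch(ctx context.Context, id string) (*Record, error) {
	ctx, cancel := context.WithTimeout(ctx, 2*time.Second)
	defer cancel() // release the timer as soon as Fetch returns

	// Attach ctx to the request so the client abandons it on cancel or timeout.
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, s.url(id), nil)
	if err != nil {
		return nil, err
	}
	resp, err := s.client.Do(req)
	if err != nil {
		return nil, fmt.Errorf("fetch %s: %w", id, err)
	}
	defer resp.Body.Close()
	// ...
}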

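The defer cancel() matters even when the call returns long before the timeout, because it stops the timer and detaches the child context from its parent straight away instead of leaving both to linger until the deadline passes. Forget it and go vet's lostcancel check will, quite rightly, shout at you.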

The bits that trip people up

A few things I learned by getting them wrong first.

Pass context, don't store it. The standard library is blunt about this and it's correct: a context belongs to a single request's lifetime, so it lives on the stack and gets passed down, not stashed in a struct field to be reused across calls. A context on a struct is a context that will eventually be cancelled at the wrong time for the wrong request.

context.Value is for request-scoped data that crosses API boundaries, things like a request ID or a trace span, and not for passing your function its actual arguments. Every time I've reached for ctx.Value to avoid adding a parameter, I've regretted it, because the dependency became invisible and untyped and the compiler stopped helping me. If a function needs a thing to do its job, that thing should be a parameter.

Honouring a context is your job, not magic. A context being cancelled doesn't stop your code; it just closes the channel returned by ctx.Done() and makes ctx.Err() return non-nil. If you have a long loop or a hand-rolled wait, you have to actually check:

for _, item := range items {
	select {
	case <-ctx.Done():
		return ctx.Err()
	default:
	}
	process(item)
}

Most library calls you hand a context to will do this checking for you, which is the whole point of the convention. But the tight loop you wrote yourself won't, unless you make it.

And don't paper over a missing deadline with context.TODO() everywhere. It compiles and it's tempting, but TODO is a note to yourself that you haven't decided where the real context comes from, not a destination. The hang that taught me all this was, at its root, a chain of calls where the context had quietly become context.Background() partway down and stopped carrying any deadline at all.

Where I've landed

I no longer see the ctx context.Context first parameter as ceremony. It's the cancellation and deadline channel for a unit of work, and threading it from the edge of the program all the way down to the call that finally blocks on the network is what lets the whole thing let go when it should. The day a slow downstream can't take your service down with it, because every call between the front door and the socket is watching the same context, is the day the parameter pays for itself many times over.

It still makes the signatures longer. I've decided that's a fair price.