Ramblings of an aging IT geek
golang

threading context through, and finally getting it

A walk through why context.Context exists in Go, how cancellation actually propagates, and the mistakes I made before it clicked.


For a long time context.Context annoyed me. It's the first argument to half the functions in a Go codebase, it's ctx everywhere, and when you're new it reads like ceremony: a parameter you pass because the linter sulks if you don't. Then I had a service that wouldn't shut down cleanly, where a SIGTERM would leave goroutines mid-flight draining a queue that no longer mattered, and the whole thing finally clicked. Context isn't ceremony. It's the wiring that lets you say "stop, we're done here" and have everyone downstream actually hear it.

the problem it solves

Imagine a single inbound HTTP request. It hits your handler, which calls a service, which calls a database and an upstream API in parallel, each of which might fan out further. That's a tree of goroutines, all doing work on behalf of one request. Now the client hangs up. Or the request takes too long and you want to give up at two seconds. How do you tell every goroutine in that tree to stop?

Without context you can't, not cleanly. You'd be passing your own chan struct{} down every call and selecting on it everywhere, reinventing the same pattern badly in each package. Context is that pattern, standardised: a value that carries a cancellation signal down through a call tree, plus an optional deadline, plus a place to stash a few request-scoped values.

The shape is a tree. You start from context.Background() at the top of your program, and every operation derives a child:

ctx, cancel := context.WithTimeout(parent, 2*time.Second)
defer cancel()

Cancel a parent and every child cancels with it. That's the whole mental model, and once I had it the rest fell into place.


how cancellation actually propagates

The signal travels through ctx.Done(), a channel that closes when the context is cancelled. You don't read a value off it; the close is the event. So the idiom in any blocking loop is a select:

func worker(ctx context.Context, jobs <-chan Job) error {
	for {
		select {
		case <-ctx.Done():
			return ctx.Err() // context.Canceled or context.DeadlineExceeded
		case j, ok := <-jobs:
			if !ok {
				return nil
			}
			if err := process(ctx, j); err != nil {
				return err
			}
		}
	}
}

The thing that took me embarrassingly long to internalise: context doesn't stop anything by itself. It can't reach into your goroutine and pause it. All it does is close a channel. If your code never checks ctx.Done(), cancellation does precisely nothing. This is why a time.Sleep in the middle of a worker is a bug waiting to happen: it ignores the context. Use a select with a timer instead, so the sleep is interruptible.

The good news is that well-behaved standard library and ecosystem code already checks. http.Client honours the context attached to a request (build the request with http.NewRequestWithContext), so both a cancel and a deadline abort the call. database/sql has QueryContext, ExecContext and friends, which abandon the query when the context is cancelled, driver permitting. So if you thread the context all the way down, cancellation Just Works at the leaves, because the leaves are network and IO calls that already understand it.

the mistakes I made first

I stored a context in a struct field. The docs tell you not to, in fairly blunt terms, and I now understand why: a context is scoped to a single operation, not to the lifetime of an object. Put it in a field and you've frozen one request's deadline onto something long-lived. It's the wrong axis. Context goes through the call chain as a parameter, always first, never hidden in state.

I overused context.WithValue. It's tempting to treat it as a request-scoped bag and shove anything in: the logger, the database handle, config. Then your function signatures lie. They say they take a context, but really they need three values smuggled inside it, and nothing tells the caller that. I now keep values to things that are genuinely cross-cutting and incidental: a request ID for tracing, an auth principal. The actual dependencies a function needs go in as proper arguments where the compiler can see them.

I also passed nil as a context in a few places early on, which compiles and then panics the moment something calls ctx.Done() on it. If you don't have a real one, context.TODO() is the honest placeholder. It behaves like Background() but signals "I haven't decided how this gets its context yet", which is useful when you're partway through wiring it through a codebase.


the shutdown that started all this

Back to the service that wouldn't die gracefully. The fix was to make a single root context that gets cancelled on a signal, and thread it everywhere:

ctx, stop := signal.NotifyContext(context.Background(),
	syscall.SIGINT, syscall.SIGTERM)
defer stop()

srv := &http.Server{Addr: ":8080", Handler: mux}

go func() {
	<-ctx.Done()
	shutdownCtx, cancel := context.WithTimeout(
		context.Background(), 10*time.Second)
	defer cancel()
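	// Shutdown returns an error if the 10s window expires with requests still open.
	if err := srv.Shutdown(shutdownCtx); err != nil {
		log.Printf("graceful shutdown did not complete: %v", err)
	}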
}()

signal.NotifyContext was added in Go 1.16 and is exactly the convenience you want here, but the pattern works the same with a manual signal.Notify and a cancel() if you're on an older release. The point is that one SIGTERM cancels the root, the cancellation flows down through every handler and worker that bothered to select on Done(), in-flight work gets a bounded window to finish, and new work is refused. The queue drainers I mentioned earlier stopped pulling the moment the context cancelled, instead of grinding through a backlog nobody was waiting for any more.

That's the whole thing, really. Context is a cancellation tree with a deadline bolted on. Pass it first, check it in your loops, derive children for sub-operations, and don't smuggle your dependencies through WithValue. Do that and the next time someone hits Ctrl-C, everything downstream hears it and tidies up. It stopped feeling like ceremony the day I watched it actually work.