For a long time context.Context felt like ceremony I was performing to make the linter happy. You add ctx context.Context as the first parameter, you pass it down, the function below takes it and passes it down again, and eventually it reaches a database call that actually uses it. In between, a dozen functions just carry the thing without touching it. It looked like overhead. It is not overhead, and the day that clicked was the day I stopped fighting it.
The point of threading it through is that the leaf calls (the HTTP request, the SQL query, the gRPC stub) are the ones that can be cancelled, and they can only be cancelled if the cancellation signal reaches them. The intermediate functions carry the context precisely because they don't know which call three levels down is the slow one. You pass it everywhere so that a deadline or a cancel set at the top, whether from a client disconnecting or a request timing out, propagates all the way to the call that's blocking. Break the chain anywhere and everything below that point becomes uncancellable. It will run to completion long after anyone cares about the answer.
Two mistakes I made repeatedly. The first was reaching for context.Background() in the middle of a call stack because I had a context-shaped hole and the incoming ctx wasn't conveniently in scope. That quietly severs the chain: the new background context has no deadline and no link to the caller, so cancellation stops dead right there. If the work genuinely needs to outlive the request, say a fire-and-forget cleanup, that's a deliberate decision you spell out with context.WithoutCancel (Go 1.21 and later), which keeps the request's values but drops its cancellation and deadline. It is not something you reach for to dodge a compile error.
The second was stuffing values into the context that the code below actually depended on to function. context.WithValue is for request-scoped metadata that's incidental, a trace ID, a request ID, the things you'd be sad but not broken to lose. The moment a handler can't work without something I'd shoved in there, I'd built an untyped, undocumented dependency-injection mechanism that the compiler couldn't help me with. Dependencies go in structs and function parameters where they're visible. The context carries the cross-cutting stuff and the cancellation, nothing load-bearing.
Once both of those stopped, the pattern reads as exactly what it is. The first parameter is the lifetime of the work. You hand it down so that whoever's doing the slow part can be told to stop. The functions in the middle aren't carrying dead weight, they're keeping the line open. It's one of those Go conventions that looks like bureaucracy right up until the first time a cancelled request cleanly tears down the forty goroutines it spawned, and then you stop questioning it.