structured logging in go, and learning to stop fighting it

Source code on a screen

For years my Go logging was a thin wrapper over log.Printf with a severity prefix bolted on. It worked, in the sense that a car with three wheels works. The day it stopped being acceptable was the day I had to debug a production incident by grepping for a request through eleven log lines that all said roughly the same thing in slightly different phrasing, because four different people had written them over two years. None of them shared a field name. One said userID, one said user_id, one said uid, and one helpfully embedded it in the middle of a sentence. I could not write a single query that found all the lines for one user. That's when structured logging stopped being a nice-to-have and became the actual job.

the printf problem

The trouble with formatted log strings is that they're write-only. You produce a nice human sentence at the call site and then, forever after, anything that wants to read it machine-side has to reverse-engineer your sentence with a regular expression. It's information destruction. You had the user ID as a typed value, you mashed it into a string, and now a parser has to dig it back out and hope you didn't change the wording.

Structured logging inverts that. Instead of a sentence, you emit a message plus a set of key/value fields:

logger.Info("request completed",
    "user_id", userID,
    "method", r.Method,
    "path", r.URL.Path,
    "status", status,
    "duration_ms", elapsed.Milliseconds(),
)

That line is still readable by a human if you render it in a console encoder, and it's directly queryable by a machine because every value is a real field. status:500 AND duration_ms:>1000 is a query you can actually run. The sentence version of that question is a grep you have to keep refining until you give up.

picking a library and not agonising over it

In 2020 the realistic choices are zap, zerolog, or logrus. I won't relitigate the benchmarks here. Logrus is the gentle on-ramp, ergonomic and reflection-heavy. Zap and zerolog are the fast ones, designed to allocate almost nothing on the hot path, which matters if you're logging on every request. I landed on zap because the structured API reads cleanly and the performance is genuinely good, but the honest truth is the library choice matters far less than the conventions you build on top of it. A team with strict field-name discipline on logrus beats a team with no discipline on zap every time.

A profiler and logging output on a terminal

the bit that actually made it stick

Switching the logging call wasn't the hard part. The hard part was getting every log line about a given request to carry the same identifying fields without every function in the call stack having to know about them. The answer was to put a logger in the request context, pre-loaded with the fields that identify the request, near the edge:

func WithLogger(ctx context.Context, l *zap.Logger) context.Context {
    return context.WithValue(ctx, loggerKey{}, l)
}

func From(ctx context.Context) *zap.Logger {
    if l, ok := ctx.Value(loggerKey{}).(*zap.Logger); ok {
        return l
    }
    return zap.L() // sane fallback, never nil
}

A middleware enriches the logger once per request with the request ID and the user, then stows it in the context:

func LoggingMiddleware(base *zap.Logger) func(http.Handler) http.Handler {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            reqID := middleware.GetReqID(r.Context())
            l := base.With(
                zap.String("request_id", reqID),
                zap.String("remote", r.RemoteAddr),
            )
            ctx := WithLogger(r.Context(), l)
            next.ServeHTTP(w, r.WithContext(ctx))
        })
    }
}

From there, any handler or service that's been passed the context gets a logger that already knows which request it's in:

func (s *Service) DoThing(ctx context.Context, id string) error {
    log := From(ctx)
    log.Info("doing thing", zap.String("thing_id", id))
    // ...
}

Now every line emitted during that request shares a request_id, and the incident I opened with becomes a single query: filter by request ID, read the story top to bottom. No grep, no guessing, no archaeology.

the conventions that mattered more than the library

A few rules that did most of the work:

One field name per concept, written down. user_id, never userID or uid. Boring, snake_case, agreed once, enforced in review.
Log the message as a stable constant, not an interpolated string. The message is for humans and for grouping; the variable bits go in fields. log.Info("payment failed", zap.String("reason", reason)), not log.Info(fmt.Sprintf("payment failed: %s", reason)).
Never log secrets, obviously, but also never log whole structs blindly. It's astonishing how often a password or token rides along inside a request object somebody decided to dump in full.
Pick your levels and mean them. Info is the normal story of the system. Warn is something a human might want to know but isn't on fire. Error is something broke and someone may need to act. If everything is an error, nothing is.

was it worth it

Yes, plainly. The migration was tedious, a few weeks of touching old code and resisting the urge to "improve" it while I was in there. But the first time a real incident turned from an hour of grep into a thirty-second query, it paid for itself. Structured logging isn't clever and it isn't new. It's just the difference between writing logs you can read and logs you can actually use, and once a system gets big enough, that's the difference between debugging and praying.