Three days. A service that fell over roughly once every few thousand requests with a nil dereference, and never, ever when I was watching it. The classic shape of a race: every diagnostic I added changed the timing enough to make it disappear. Add a log line, the bug sulks off for an hour. Attach a debugger, it goes on holiday entirely.
The thing that finally cracked it was giving up on reproducing live and reading the code as if I trusted nobody. There was a cache that initialised itself lazily on first use, guarded by a check-then-set with no lock around the pair. Under load, two requests would both see the cache empty, both start building it, and one would hand a half-built map to the other. Single-threaded, impossible. Under concurrency, just rare enough to look like haunting.
The fix was a sync.Once so the initialisation happened exactly once no matter how many goroutines arrived together. Five lines. Three days.
What I take from it: stop trying to catch an intermittent bug in the act when adding observers changes the timing. Find the shared state, find the window where two things touch it without a lock, and the reproduction stops mattering. I knew that already. I'll know it harder now.