Ramblings of an aging IT geek
networking

the resolver was fine, the search domain was not

A homelab DNS outage that wasn't a DNS outage at all, traced to a search domain I'd left in dhcpd and a wildcard record that answered for every short name it produced.


For about an hour on Sunday, half the names on my network resolved to the same wrong address. Not failed to resolve, which would have been easier to spot. Resolved, confidently, to a box that had nothing to do with anything I'd asked for.

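The resolver was healthy. dig @localhost github.com returned the right answer every time. That only proved it could answer a fully qualified name, though, since dig ignores the search list unless you pass +search. ping fileserver from a laptop ended up at my reverse proxy, and ping printer did too. Everything internal collapsed onto one IP.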

The culprit was a search domain. I'd added one to my dhcpd config months ago so I could type short hostnames, then later set up a wildcard record on my internal zone for a project, *.home.lan pointing at the proxy. The two combined beautifully. The laptop would take the bare name fileserver, helpfully append the search domain and ask for fileserver.home.lan, which the wildcard happily answered with the proxy's address. Short name, wrong box, no error anywhere.
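
A toy model makes the interaction easier to see. This is only a sketch, not how any real resolver is implemented: the candidate ordering loosely follows the glibc-style ndots rule (other resolvers order things differently), the lookup function is a crude stand-in for the zone, and every address below is a placeholder from the documentation ranges rather than anything on the real network.

```python
from __future__ import annotations

# Toy model of a stub resolver's search list meeting a wildcard record.
# Addresses, names and the NDOTS value are illustrative assumptions.
SEARCH_DOMAINS = ["home.lan"]         # the search domain dhcpd hands out
NDOTS = 1                             # glibc's default threshold
PROXY = "192.0.2.10"                  # hypothetical reverse proxy address
KNOWN = {"github.com": "198.51.100.7"}  # placeholder for a public answer


def candidates(name: str) -> list[str]:
    """Return the names a resolver would try, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as absolute: no search list.
        return [name.rstrip(".")]
    expanded = [f"{name}.{domain}" for domain in SEARCH_DOMAINS]
    if name.count(".") >= NDOTS:
        return [name] + expanded      # looks qualified: try it as-is first
    return expanded + [name]          # short name: search list first


def lookup(fqdn: str) -> str | None:
    """Crude stand-in for DNS: *.home.lan answers for everything under it."""
    if fqdn in KNOWN:
        return KNOWN[fqdn]
    if fqdn.endswith(".home.lan"):
        return PROXY
    return None


def resolve(name: str) -> tuple[str, str] | None:
    """Return (name that answered, address) for the first candidate that hits."""
    for fqdn in candidates(name):
        address = lookup(fqdn)
        if address is not None:
            return fqdn, address
    return None


print(resolve("fileserver"))  # ('fileserver.home.lan', '192.0.2.10')
print(resolve("printer"))     # ('printer.home.lan', '192.0.2.10')
print(resolve("github.com"))  # ('github.com', '198.51.100.7')
```

Every short name lands on the proxy, while the fully qualified name used in the dig test comes back clean, which is the same split the symptoms showed.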


What made it slow to find was that nothing was broken in the way I expected. There was no SERVFAIL, no timeout, no log line screaming about a dead upstream. Every layer was doing exactly what I'd told it to. The resolver answered, the wildcard matched, the client followed its search list. The bug was the sum of three correct decisions made months apart.
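
Since nothing in the stack reports this as an error, one way to get a signal is to ask for a name that cannot exist and complain if it resolves. The sketch below is an assumption about how such a check might look, not something from the original setup: wildcard_canary and system_resolve are hypothetical helper names, and the resolver is passed in as a parameter so the check can be exercised without a network.

```python
from __future__ import annotations

import secrets
import socket
from typing import Callable, Optional


def system_resolve(name: str) -> Optional[str]:
    """Resolve through the OS resolver, so the search list is applied."""
    try:
        return socket.gethostbyname(name)
    except OSError:  # gaierror and herror are both subclasses of OSError
        return None


def wildcard_canary(
    resolve: Callable[[str], Optional[str]] = system_resolve,
) -> Optional[str]:
    """Look up a random short name that should not exist anywhere.

    Returns the address if something answered (a wildcard is probably
    catching search-list lookups), or None if the lookup failed as it should.
    """
    probe = "canary-" + secrets.token_hex(8)  # random, dot-free label
    return resolve(probe)
```

Run from a client on the affected network, wildcard_canary() returning an address rather than None would have pointed at the wildcard straight away.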

The fix was to scope the wildcard properly: put the project on its own subdomain, *.proj.home.lan, so it stops catching every short name I type. I also added explicit A records for the handful of internal hosts I actually care about, so they win before any wildcard gets a look in.
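
The effect of the fix can be sketched with a simplified version of the wildcard matching rule: exact records win, and otherwise the wildcard is only consulted at the closest existing ancestor of the queried name. This is a rough approximation of the real rules, and the zone contents and addresses are hypothetical placeholders.

```python
from __future__ import annotations

PROXY = "192.0.2.10"  # hypothetical reverse proxy address

# Zone before the fix: one wildcard directly under the search domain.
BEFORE = {"*.home.lan": PROXY}

# Zone after the fix: wildcard scoped to the project, explicit host records.
AFTER = {
    "*.proj.home.lan": PROXY,
    "fileserver.home.lan": "192.0.2.20",  # placeholder address
    "printer.home.lan": "192.0.2.30",     # placeholder address
}


def answer(zone: dict[str, str], fqdn: str) -> str | None:
    """Simplified wildcard lookup: exact match first, then the closest encloser."""
    if fqdn in zone:
        return zone[fqdn]
    labels = fqdn.split(".")
    for i in range(1, len(labels)):
        ancestor = ".".join(labels[i:])
        # An ancestor "exists" if it is a record or has any record beneath it.
        exists = any(
            name == ancestor or name.endswith("." + ancestor) for name in zone
        )
        if exists:
            # Only a wildcard directly under the closest encloser applies.
            return zone.get("*." + ancestor)
    return None


print(answer(BEFORE, "fileserver.home.lan"))  # 192.0.2.10
print(answer(AFTER, "fileserver.home.lan"))   # 192.0.2.20
print(answer(AFTER, "typo.home.lan"))         # None
print(answer(AFTER, "app.proj.home.lan"))     # 192.0.2.10
```

With the scoped wildcard, a mistyped short name fails outright instead of quietly landing on the proxy, and only names under proj.home.lan are still caught.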

The lesson, again, is that the most confusing outages aren't the ones where something fails. They're the ones where everything succeeds and the result is still wrong. A wildcard DNS record sitting directly under your search domain is a loaded gun pointed at every unqualified lookup on your network, and I'd left the safety off.