Ramblings of an aging IT geek

a wormable bug in windows dns, and why the word "wormable" empties a room

Reacting to the SIGRed Windows DNS Server flaw disclosed this week, what makes a seventeen-year-old bug suddenly urgent, and how we triaged it without panicking the estate.

A city skyline at dusk, the sort of infrastructure you stop noticing until it breaks

This week's Patch Tuesday landed with one item that pushed everything else off my feeds: a critical flaw in the Windows DNS Server role, CVE-2020-1350, which Check Point's researchers have named SIGRed. Microsoft gave it a CVSS base score of 10.0 and used the word that empties a room, "wormable," meaning a single compromised server could spread the exploit to others with no human in the loop. The bug has apparently been sitting in the code for around seventeen years. Nobody is claiming it has been exploited in the wild yet, but a maximum-severity, wormable bug in a role that so often lives on domain controllers is the kind of thing you patch this week rather than next.

I want to be careful here, because the temptation with a 10.0 is to either panic or shrug, and both are wrong. So let me write down how we actually triaged it, because the process matters more than the headline.

what actually makes this one bad

Severity scores are blunt. What makes SIGRed genuinely worth dropping other work for is the combination of three things, not the number.

The first is where it runs. The Windows DNS Server role is very often co-located with Active Directory, on domain controllers, which are the crown jewels. A bug that gives you code execution on a DNS server frequently gives you code execution on a domain controller, and from there the whole forest is a negotiation.

The second is reachability. The exploit can be triggered through DNS responses, which means a server that makes outbound queries to resolve external names can be attacked by controlling the answer it gets back. You do not necessarily need to be inside the network first. That is what lifts it from "bad if someone is already on your DC" to "bad full stop."

The third is the wormability. One compromised DNS server can be used to compromise the next. That is the property that turns an incident into an outbreak, and it is why the language around this one is sharper than the usual Tuesday fare.

The same infrastructure from another angle, still quietly load-bearing

how we triaged it without losing the morning

Step one was inventory, because you cannot patch what you cannot see. The vulnerable component is the DNS Server role, not the DNS client every Windows box runs. A workstation resolving names is not affected. A server with the DNS Server role installed is. So the first job was an honest list of which machines actually run the role, which is a smaller and more important set than "all our Windows servers," and it is worth getting right rather than guessing.
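A rough first pass at that list can be scripted. The sketch below is an assumption about how you might narrow the field, not how we did it: it simply checks which hosts accept a TCP connection on port 53. The function names are mine, and an open port only makes a machine a candidate. Anything it flags still needs confirming on the host itself, since non-Windows resolvers answer on 53 as well and a firewall can hide a server that does run the role.

```python
import socket


def accepts_tcp(host: str, port: int = 53, timeout: float = 2.0) -> bool:
    """Return True if `host` accepts a TCP connection on `port`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def dns_candidates(hosts, port: int = 53, timeout: float = 2.0):
    """Filter `hosts` down to those that answer on the DNS TCP port.

    This yields candidates only; it cannot tell a Windows DNS Server
    from any other service listening on the same port.
    """
    return [h for h in hosts if accepts_tcp(h, port, timeout)]
```

Feeding it the server list from whatever asset register you trust gives you a shortlist to check by hand, which is usually quicker than arguing about whose spreadsheet is current.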

Step two was the patch itself. Microsoft shipped a fix in this week's update, and that is the real answer: apply it. For us that meant the domain controllers and the handful of internal DNS servers, in a maintenance window, tested on one before the rest. Boring, correct, done.

Step three, and the reason I am glad I read past the headline, is that Microsoft also published a registry workaround for anyone who genuinely cannot patch immediately. It caps the maximum length of a DNS message over TCP, which closes the specific path the exploit needs, at the cost of a documented edge case around very large responses.

HKLM\SYSTEM\CurrentControlSet\Services\DNS\Parameters
  TcpReceivePacketSize = 0xFF00 (DWORD)
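For anyone wondering where that number sits, here is a small Python sketch of the arithmetic. DNS over TCP prefixes each message with a two-byte length field (RFC 1035), so no message can exceed 0xFFFF bytes, which as I read Microsoft's guidance is also the default for this setting. The constant and function names are my own, and treating "larger than the cap" as the cut-off is my reading of the guidance rather than something I have tested against a server.

```python
import struct

TCP_DNS_MAX = 0xFFFF      # largest length a two-byte prefix can express
WORKAROUND_CAP = 0xFF00   # value from the registry workaround


def frame(message: bytes) -> bytes:
    """Prefix a DNS message with its two-byte big-endian length, as TCP transport does."""
    if len(message) > TCP_DNS_MAX:
        raise ValueError("message too long for DNS over TCP")
    return struct.pack("!H", len(message)) + message


def within_cap(message_len: int, cap: int = WORKAROUND_CAP) -> bool:
    """True if a TCP DNS message of this length fits under the cap (assumed inclusive)."""
    return message_len <= cap


print(WORKAROUND_CAP)                # 65280
print(TCP_DNS_MAX - WORKAROUND_CAP)  # 255
```

In other words the workaround shaves 255 bytes off the top of the range, which is why only unusually large responses should ever notice.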

A workaround is not a patch, and you should still patch. But having a mitigation that buys time changes the shape of the conversation with anyone who controls a change window and is nervous about touching a domain controller on short notice. It turns "patch now or be exposed" into "mitigate now, patch in the window," which is a much easier sentence to get agreed.

The thing I appreciate about this particular workaround is that it is specific and explicable. It is not "turn off the feature" or "block a port and pray." It caps the TCP DNS message size to a value that is below what the exploit needs but above what almost any normal response requires, which means it closes the attack path without breaking ordinary resolution. You can explain it to a change board in one sentence and they can understand the trade. Microsoft also notes the edge case where unusually large responses could be affected, which is exactly the sort of honesty you want, because a mitigation that quietly breaks something rare is worse than no mitigation at all.

We applied it on the servers that could not take the patch immediately, scheduled the patch for the next window, and then, crucially, went back and removed the registry value once the patch was in. A temporary mitigation that becomes permanent because nobody cleaned it up is a future surprise you are setting for yourself.
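If you want the apply and the clean-up to be equally easy, one option is to generate both as .reg files up front, so the removal already exists on the day you need it. This is a sketch rather than what we ran: the helper name is hypothetical, and it only builds the file text, leaving the write, the import, and the file encoding to you. Microsoft's guidance is that the DNS service needs a restart before the change takes effect.

```python
from typing import Optional

KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\DNS\Parameters"
VALUE_NAME = "TcpReceivePacketSize"


def reg_file(value: Optional[int]) -> str:
    """Build .reg file text that sets the DWORD, or deletes it when value is None."""
    if value is None:
        # In .reg syntax, "Name"=- removes the named value.
        setting = f'"{VALUE_NAME}"=-'
    else:
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError("DWORD out of range")
        setting = f'"{VALUE_NAME}"=dword:{value:08x}'
    lines = ["Windows Registry Editor Version 5.00", "", f"[{KEY}]", setting, ""]
    return "\r\n".join(lines)


apply_text = reg_file(0xFF00)   # the mitigation
remove_text = reg_file(None)    # the clean-up after patching
```

Keeping the two files side by side in the change record is a cheap way of making sure the second half of the job does not depend on somebody's memory.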

A wider view of the skyline at night, lit up and dependent on systems nobody is watching

the part that stays with me

Seventeen years. The bug has been in that code since long before most of the people now patching it had the job. That is not a knock on Microsoft specifically. It is the nature of large, old, load-bearing code: the severity of a bug tells you nothing about how recently it was introduced, and "we have run this for years without incident" is not evidence of safety, only of nobody having looked in the right place with the right intent. SIGRed is a reminder that the boring infrastructure, the DNS server you provisioned once and never think about, is exactly where the worst surprises hide, precisely because nobody thinks about it.

There is a second-order lesson too, about how disclosure actually works. SIGRed was found by researchers who went looking, reported it responsibly, and gave Microsoft time to ship a fix before the details went public. By the time it was all over my feeds, the patch already existed. That is the system working roughly as intended, and it is worth saying so on a week when the easy reaction is to be cynical about how broken everything is. The bug is bad. The handling of it was good. Both things are true, and the second is why we got to respond in a planned window rather than during an active outbreak.

We patched. The estate is fine. But I have spent a chunk of this week looking at our own quiet, load-bearing, long-untouched services and asking which one is sitting on its own seventeen-year-old surprise. The honest answer is that I do not know, and that is the uncomfortable lesson every disclosure like this hands you for free.