Ramblings of an aging IT geek

Dirty COW, a month on, and why the boring patch is the hard part

Reflections on the Dirty COW privilege-escalation bug a month after disclosure, and why the patching, not the exploit, is the real work.

The Dirty COW bug (CVE-2016-5195) is still the thing people keep bringing up, even though it was disclosed back in October. That's partly because it's a genuinely lovely bug, in the way only a really old one can be: a race condition in the kernel's copy-on-write handling that has been sitting there since around 2007 (kernel 2.6.22, by most accounts), and that lets any local user escalate to root. Local-only, yes, but "local" is a low bar on a shared host or anywhere an attacker already has a foothold. And there were reports of it being used in the wild before the fix landed, which is the detail that turns a curiosity into a fire drill.

The exploit got all the attention because it's elegant and it has a logo and a catchy name, and I understand the appeal. But the interesting part for anyone running real machines isn't the race condition. It's everything that comes after the patch ships.

Because here's the uncomfortable bit. The fix went out quickly, the distros pushed updated kernels within days, and most people I know ran their package manager, saw a new kernel installed, and mentally ticked the box. The trouble is that a kernel update does precisely nothing until you reboot into it. I've now lost count of the boxes I've looked at this month that have the patched kernel sitting on disk and the vulnerable one still running, because nobody wanted to take the reboot. Comparing the output of uname -r against the installed kernel package version is a sobering little check to run across a fleet.
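For what it's worth, that check is easy enough to rough out. What follows is a minimal sketch rather than anything I'd call a proper tool. It assumes every installed kernel leaves a directory named after its release under /lib/modules, which is common but not guaranteed, and the helper names (version_key, installed_kernels, needs_reboot) are made up for the illustration. The version ordering is deliberately naive and is not your distro's real package comparison.

```python
import os
import platform
import re


def version_key(release):
    """Turn a kernel release string into a sortable tuple.

    Digit runs compare as integers and letter runs as strings, so
    "4.4.0-47-generic" sorts above "4.4.0-9-generic". This is a naive
    ordering, not the distro's own package version comparison.
    """
    return tuple(
        (0, int(tok)) if tok.isdigit() else (1, tok)
        for tok in re.findall(r"\d+|[A-Za-z]+", release)
    )


def installed_kernels(modules_dir="/lib/modules"):
    """List kernel releases that have a module directory on disk.

    Assumption: each installed kernel has a directory named after its
    release under /lib/modules. Returns an empty list if it is missing.
    """
    try:
        return os.listdir(modules_dir)
    except OSError:
        return []


def needs_reboot(running, installed):
    """Return True if any installed kernel sorts newer than the running one."""
    running_key = version_key(running)
    return any(version_key(rel) > running_key for rel in installed)


if __name__ == "__main__":
    running = platform.release()  # on Linux, the same string uname -r prints
    if needs_reboot(running, installed_kernels()):
        print("running %s, but a newer kernel is on disk: reboot pending" % running)
    else:
        print("running %s, no newer kernel found on disk" % running)
```

Two caveats. A newer kernel on disk only tells you a reboot is outstanding, not that the newer kernel actually carries the fix, so you still want to check the version against your distro's advisory. And on Debian and Ubuntu systems the presence of /var/run/reboot-required is usually a quicker signal, where the package that maintains it is installed.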

That's the real shape of this kind of disclosure. The clever people who find the bug and the clever people who write the exploit get the write-ups and the conference talks. The unglamorous middle, actually getting the fix running on every machine that needs it, is where the security genuinely lives or dies, and it's nobody's favourite job. It means scheduling reboots, it means dealing with the box that won't come back cleanly, it means the one server everyone's forgotten about that's three kernels behind and runs something load-bearing.

Long-lived servers that "can't" be rebooted are exactly the ones that accumulate this debt, and Dirty COW is a pointed reminder of why live-patching tooling, for all its faff, is worth having where you can run it. A vulnerability that needs a reboot to fix is, on a box that never reboots, effectively unpatched no matter what your package manager thinks.

So by all means enjoy the exploit. It's good work and it deserves the attention. But the question worth asking this week isn't "have we patched", it's "have we actually rebooted onto the patch, on all of them, including the one we don't like touching". Mine took until the weekend, and the last one took a deep breath.