Ramblings of an aging IT geek

the bug class that never quite goes away

A look at the December 2019 round of disclosures and why the same handful of root causes keep producing the same headlines, with notes on what actually moves the needle in practice.

It is December, which by long tradition means a fresh disclosure doing the rounds and everyone in the office Slack pasting the same advisory link with a slightly different one-line take. This month it is a memory-corruption issue in a widely deployed bit of plumbing, the kind that lives below the application and so quietly that most of the people now patching it would have struggled to name it a week ago. The exact CVE does not matter much for what I want to say, and honestly by the time you read this there will be another one. What strikes me, again, is how few distinct shapes these things actually come in.

I have been doing this long enough to have a sort of mental bingo card. Trust boundary crossed without revalidation. An integer that overflowed where someone assumed it could not. A length read from attacker-controlled input and believed. A default that was convenient in 2009 and dangerous by 2014 and still shipping. A parser, almost always a parser. Every quarter the logo on the advisory changes and the underlying mistake is one of about six old friends.
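To make "a length read from attacker-controlled input and believed" concrete, here is a minimal sketch of the check that tends to go missing. The wire format (a 4-byte big-endian length followed by the payload), the `MAX_PAYLOAD` limit and the `parse_record` name are all invented for illustration and are not taken from any real advisory. In Python the unchecked version would only hand back a silently short payload; the same omission in C is how you end up reading or writing past a buffer.

```python
import struct


class ParseError(ValueError):
    """Raised when a record's declared length cannot be trusted."""


# Hypothetical upper bound for this sketch; a real protocol defines its own.
MAX_PAYLOAD = 64 * 1024


def parse_record(buf: bytes) -> bytes:
    """Parse one record: a 4-byte big-endian length, then that many payload bytes."""
    if len(buf) < 4:
        raise ParseError("truncated header")

    # The declared length comes from the sender, so it is a claim, not a fact.
    (declared,) = struct.unpack(">I", buf[:4])

    if declared > MAX_PAYLOAD:
        raise ParseError("declared length exceeds limit")

    # Compare the claim against the bytes that actually arrived.
    available = len(buf) - 4
    if declared > available:
        raise ParseError("declared length exceeds data actually received")

    return buf[4:4 + declared]
```

With this in place, `parse_record(struct.pack(">I", 3) + b"abcXYZ")` returns `b"abc"`, while a record that claims ten bytes and carries three raises `ParseError` instead of being believed.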

why the same bugs keep winning

The uncomfortable answer is that the conditions that produce these bugs have not changed, so the bugs have not changed either.

Memory-unsafe languages still run an enormous fraction of the internet's load-bearing code. That is not a moral failing; it is forty years of accumulated infrastructure that works, and you do not rewrite OpenSSL or the kernel's networking stack on a Tuesday because a blog post annoyed you. But it does mean a whole category of mistake remains reachable. You can be a careful, experienced C programmer and still, on a tired afternoon, get a bounds check subtly wrong, and the language will not stop you. The tooling has improved a lot. Sanitisers, fuzzing, the way projects like OSS-Fuzz quietly grind through inputs all day: these have genuinely changed the landscape. They find these bugs faster now. They do not stop them being written.

The second thing is that the interesting failures live at the seams. Not inside one well-reviewed function, but where two systems meet and each assumed the other was checking. The proxy trusts the backend, the backend trusts the proxy, and the request that lies to both walks straight through. I have lost more hours of my life to "but I validated it here" than to any single clever exploit. The clever exploits make the conference talks. The seam bugs make the outages.
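One well-known instance of a request that lies to both sides is an HTTP message carrying two different Content-Length headers, where the proxy honours one and the backend the other. The sketch below is one way a backend might refuse to guess rather than trust that the hop in front of it already checked. The function name, the exception and the list-of-pairs header representation are assumptions made for this example, and it is deliberately stricter than a full HTTP implementation needs to be.

```python
from typing import List, Optional, Tuple


class AmbiguousRequest(ValueError):
    """Raised when the body length cannot be determined unambiguously."""


def content_length(headers: List[Tuple[str, str]]) -> Optional[int]:
    """Return the request's Content-Length, or None if the header is absent.

    Raises AmbiguousRequest if repeated headers disagree or the value is
    not a plain run of ASCII digits.
    """
    # Collect every distinct value, ignoring header-name case and padding.
    values = {
        value.strip()
        for name, value in headers
        if name.lower() == "content-length"
    }
    if not values:
        return None
    if len(values) > 1:
        # Two parsers could each pick a different one; reject, do not choose.
        raise AmbiguousRequest("conflicting Content-Length headers")

    (raw,) = values
    if not (raw.isascii() and raw.isdigit()):
        raise AmbiguousRequest(f"malformed Content-Length: {raw!r}")
    return int(raw)
```

The design choice is to reject at the seam and not to pick a winner: once both sides refuse anything ambiguous, neither has to assume the other was checking.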

the gap between disclosure and reality

Here is the part the headlines flatten. A disclosure is an event. Patching is a process, and the process is where it all goes wrong.

The advisory lands. The mature shops with an actual inventory of what they run know within an hour whether they are exposed. Everyone else spends the first day finding out what they even have. This is the bit nobody puts in the threat model: you cannot patch a thing you have forgotten you are running. The vulnerable library is four levels deep in a container image built by a team that left, pinned to a version no one remembers choosing, in a service that "we were going to decommission". I have been that team. I have left that image.

Then there is the long tail. The patch exists, it is good, it is free, and three years from now the same version will still be sitting on the public internet on hardware somebody forgot. The disclosure is not what creates the risk window. The disclosure mostly closes it for the people paying attention, and does nothing at all for the people who were always going to be the eventual victims. That asymmetry is the whole game.

what actually helps

I am wary of turning this into a listicle of security hygiene, because you have read that post and so have I and neither of us changed our behaviour as a result. So let me keep it to the things that have actually saved me, personally, in anger.

Knowing what you run, mechanically and continuously, beats almost everything else. Not a spreadsheet someone updates when they remember. A generated inventory, a software bill of materials (SBOM) if you want the grown-up term, that tells you on the morning of a disclosure whether the affected component is anywhere in your estate. The shops that handle these weeks calmly are not the ones with cleverer engineers. They are the ones who can answer "are we affected" in minutes instead of days.
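As a rough illustration of what answering "are we affected" in minutes looks like once the inventory exists, here is a small sketch. The `Component` record, the service and library names and the version numbers are all made up; a real inventory would more likely come from an SBOM format such as CycloneDX or SPDX, and real version strings are messier than the purely numeric dotted ones assumed here.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Component(NamedTuple):
    service: str   # which deployed service ships this component
    name: str      # library or package name
    version: str   # assumed to be numeric and dotted, e.g. "1.2.10"


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.2.10' into (1, 2, 10) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def affected(
    inventory: Iterable[Component],
    name: str,
    first_bad: str,
    first_fixed: str,
) -> List[Component]:
    """Return components named `name` with first_bad <= version < first_fixed."""
    lo, hi = parse_version(first_bad), parse_version(first_fixed)
    return [
        c for c in inventory
        if c.name == name and lo <= parse_version(c.version) < hi
    ]


# Hypothetical estate: three services, two of them shipping "libplumbing".
inventory = [
    Component("billing-api", "libplumbing", "1.2.9"),
    Component("legacy-reports", "libplumbing", "1.1.0"),
    Component("frontend", "libother", "3.0.1"),
]

# Hypothetical advisory: versions from 1.2.0 up to, not including, 1.2.10.
hits = affected(inventory, "libplumbing", "1.2.0", "1.2.10")
```

Here `hits` contains only the `billing-api` entry. Comparing parsed tuples and not raw strings matters: as text, "1.2.9" sorts after "1.2.10", which would quietly give the wrong answer.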

Reducing the blast radius helps more than chasing zero bugs, because you will never get to zero bugs. Network segmentation that means a compromised front-end cannot immediately reach the database. Credentials scoped so tightly that stealing one buys you very little. The assumption, baked in everywhere, that any given component might be the one that falls over this quarter. You are not trying to be unbreakable. You are trying to make the first break boring.
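One way to reason about blast radius is to write down which components are allowed to talk to which, then ask how many hops away everything is from the component you assume has fallen. The sketch below does that with a breadth-first search. The service names and the `allowed` map are hypothetical, and it models network reachability only, not credentials or anything else an attacker might pivot through.

```python
from collections import deque
from typing import Dict, Set


def hops_from(allowed: Dict[str, Set[str]], compromised: str) -> Dict[str, int]:
    """Return each reachable component and its hop count from `compromised`.

    `allowed[a]` is the set of components that `a` may open connections to.
    """
    dist = {compromised: 0}
    queue = deque([compromised])
    while queue:
        node = queue.popleft()
        for nxt in sorted(allowed.get(node, ())):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    del dist[compromised]  # report only what lies beyond the starting point
    return dist


# Flat network: the front-end can open a connection straight to the database.
flat = {"web": {"app", "db"}, "app": {"db"}}

# Segmented: the front-end can only reach the app tier.
segmented = {"web": {"app"}, "app": {"db"}}

flat_hops = hops_from(flat, "web")
segmented_hops = hops_from(segmented, "web")
```

In the flat layout the database is one hop from a compromised front-end; in the segmented one it is two, so the attacker has to break a second component first. That second hop is where you get time to notice.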

And, genuinely, boring upgrade discipline. The unglamorous habit of staying close to current, so that when the emergency patch arrives you are applying a small delta and not attempting a major version migration under fire. Most of the truly miserable disclosure weeks I have lived through were miserable not because the bug was sophisticated but because we were nine months behind and the fix would not apply cleanly.

So I will paste the link in Slack like everyone else, and I will patch the affected boxes, and I will feel briefly virtuous. But the actual work was done in the quiet months before: the inventory, the segmentation, the unsexy upgrades. The disclosure is just the exam. You revise long before you sit it, or you don't, and December tells you which.