There is a particular kind of outage that does not feel like an outage. Nothing crashes. No pager goes off at 3am. Things just quietly stop working, and you find out because someone in a Slack channel asks why the numbers went flat at midnight. That happened to a lot of people this week, courtesy of a deprecation that had been on a calendar for the best part of a year and that almost nobody had actually actioned.
I will not name and shame, partly because the specifics will be different by the time you read this and partly because it does not matter. A provider had announced, well in advance, that an older API version was being switched off at the end of April. They sent the emails. They put up the banners. They added the deprecation warnings to the response headers. And then, on the appointed day, they did exactly what they said they would do, and a great many services that had been quietly ignoring all of that simply fell over.
We are all bad at this
The reaction online has split into the predictable two camps. One half is furious at the provider for "breaking" things. The other half is smugly pointing out that a year's notice is a year's notice, and that you reap what you sow. Both of them are missing the actual problem, which is that deprecation warnings are designed for humans and the things calling deprecated APIs are very rarely watched by humans.
A Sunset header is a lovely idea. It is also completely invisible to a cron job written by someone who left the company in 2023. The warning email goes to an inbox nobody monitors. The banner appears in a dashboard nobody opens because the integration "just works". The whole mechanism assumes an attentive operator, and the entire point of a working integration is that there is no attentive operator. That is the contradiction at the heart of every deprecation timeline, and we keep pretending it does not exist.
What actually helps
I have been on both sides of this, and the only thing I have seen genuinely work is making the warning impossible to ignore well before the cutoff. Brownouts, where the old version is deliberately taken down for an hour during business hours a few weeks ahead, are crude and they generate angry tickets, but they generate them from people who are awake and can do something about it. That is infinitely better than a clean break at midnight when the only thing that notices is your error budget.
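For the provider side, a brownout schedule can be very little code. What follows is a minimal sketch, not anybody's real implementation: the function names, the four weekly windows, the 14:00 start and the one-hour duration are all assumptions I picked for illustration, and the weekend handling is just one plausible reading of "business hours".

```python
from datetime import datetime, timedelta, timezone


def brownout_windows(cutoff, weeks_ahead=(4, 3, 2, 1), start_hour=14,
                     duration=timedelta(hours=1)):
    """Return (start, end) pairs for planned brownouts before the cutoff.

    All defaults are illustrative assumptions, not a recommendation.
    """
    windows = []
    for weeks in weeks_ahead:
        day = cutoff - timedelta(weeks=weeks)
        start = day.replace(hour=start_hour, minute=0, second=0, microsecond=0)
        # A brownout on a weekend wakes nobody up, so step back to Friday.
        while start.weekday() >= 5:
            start -= timedelta(days=1)
        windows.append((start, start + duration))
    return windows


def in_brownout(now, cutoff, **kwargs):
    """True if `now` falls inside any planned brownout window (end exclusive)."""
    return any(start <= now < end
               for start, end in brownout_windows(cutoff, **kwargs))
```

A request handler for the old version would call in_brownout with the current time and refuse the request while it returns True. The times here are all in one timezone, which is a simplification: "business hours" for a global customer base is a harder question than this sketch admits.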
The other half of the job is on us, the consumers. If you depend on a third-party API, you should be alerting on its deprecation headers the same way you alert on a rising error rate. It is not hard. The header is right there in every response. We just do not bother, because parsing someone else's polite warning into our own monitoring feels like work for a problem that is not yet on fire.
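To show how little work it is, here is a rough sketch of turning a Sunset header into an alert level. It assumes the provider sends Sunset as a standard HTTP date, which is what RFC 8594 specifies. The function names, the 90-day and 14-day thresholds and the "ok" / "warn" / "page" labels are mine, made up for the example, and you would wire the result into whatever monitoring you already have.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def parse_sunset(headers: Mapping[str, str]) -> Optional[datetime]:
    """Return the Sunset date as an aware datetime, or None if missing or unparseable."""
    # Header names are case-insensitive, so compare in lower case.
    raw = next((v for k, v in headers.items() if k.lower() == "sunset"), None)
    if raw is None:
        return None
    try:
        sunset = parsedate_to_datetime(raw.strip())
    except (TypeError, ValueError):
        return None  # malformed date: treated as absent in this sketch
    if sunset.tzinfo is None:
        sunset = sunset.replace(tzinfo=timezone.utc)
    return sunset


def sunset_severity(headers: Mapping[str, str], now: datetime,
                    warn_days: int = 90, page_days: int = 14) -> str:
    """Map the time left before sunset onto a hypothetical alert level."""
    sunset = parse_sunset(headers)
    if sunset is None:
        return "ok"
    days_left = (sunset - now).total_seconds() / 86400
    if days_left <= page_days:
        return "page"
    if days_left <= warn_days:
        return "warn"
    return "ok"
```

Treating a malformed date as "ok" is the lazy choice, and arguably the wrong one: a Sunset header you cannot parse is still a provider trying to tell you something. A stricter version would return "warn" in that case.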
So here we are again. A perfectly well-managed, well-communicated, long-scheduled change took out a load of services, and the only surprising thing is that anyone was surprised. I spent part of this morning grepping our own codebase for calls to anything with a "v1" in the path and a shutdown date in its future. You probably should too, while the lesson is fresh and slightly embarrassing rather than expensive.
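If plain grep feels too blunt, the same search is a few lines of Python. This is only a sketch of the idea: the function names, the URL pattern and the list of file suffixes are assumptions of mine, and it will happily miss URLs that are assembled from parts at runtime. It also cannot tell you the shutdown date. That still means reading the provider's documentation.

```python
import re
from pathlib import Path
from typing import Iterator, List, Tuple


def find_versioned_urls(text: str, version: str = "v1") -> List[Tuple[int, str]]:
    """Return (line number, URL) for each http(s) URL with /<version> as a path segment."""
    # The lookahead stops "v1" from also matching "v10" or "v1beta".
    pattern = re.compile(
        r"https?://[^\s\"'<>)]*/" + re.escape(version)
        + r"(?=[/?\"'\s<>)]|$)[^\s\"'<>)]*"
    )
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((line_no, match.group(0)))
    return hits


def scan_tree(root: str,
              suffixes=(".py", ".yaml", ".yml", ".json", ".toml")
              ) -> Iterator[Tuple[Path, int, str]]:
    """Walk a directory and yield (path, line number, URL) for every hit."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            for line_no, url in find_versioned_urls(text):
                yield path, line_no, url
```

Calling scan_tree(".") from the repository root and printing each hit gives you a list to check against each provider's deprecation schedule.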