The Afternoon A Full /var Took The Service Down
#debugging#ops
The Bug That Refused To Exist Under strace
#debugging#ops
The Bug That Only Existed When I Was Not Looking
#debugging#ops
The Day It Was, In Fact, DNS
#debugging#ops
A Memory Leak That Was a Map I Never Cleared
#debugging#ops
Three Days Lost to a Race I Couldn't Reproduce
#debugging#ops
The Off-by-One That Three of Us Approved
#debugging#ops
The Bug Was in My Head, Not the Function
#debugging#ops
It Was the MTU. It's Always the MTU.
#debugging#ops
tcpdump Saved Me Again
#debugging#ops
The Off-By-One That Three of Us Approved
#debugging#ops
When the Code Was Right and I Was the Bug
#debugging#ops
The Bug Was in My Assumptions, Not the Code
#debugging#ops
the leak was a map that only ever grew
#debugging#ops
when in doubt, watch the wire
#debugging#ops
the cron job that fired twice and told no one
#debugging#ops
three days lost to a gap between two lines
#debugging#ops
the cron job that ran twice and told no one
#debugging#ops
the cron job that fired twice and told no one
#debugging#ops
it was the mtu, it's always the mtu
#debugging#ops
the leak was a cache that only ever grew
#debugging#ops
chasing a race condition for three days
#debugging#ops
three days for a missing lock
#debugging#ops
three days chasing a bug that only existed sometimes
#debugging#ops
the day dns took down everything
#debugging#ops
the cron job that ran twice and told no one
#debugging#ops
the backup that overlapped itself and nobody noticed
#debugging#ops
the leak was a map, and the map was me
#debugging#ops
chasing a race condition for three days
#debugging#ops
the bug that only existed when nobody was watching
#debugging#ops
the code was fine, i was the bug
#debugging#ops
small packets fine, large packets gone: it was the mtu
#debugging#ops
it was never the database, it was dns
#debugging#ops
the leak was a map, and the map was me
#debugging#ops
the cron job that ran twice and told no one
#debugging#ops
the leak was a map i forgot to empty
#debugging#ops
it was never the network, it was always dns
#debugging#ops
the outage was a full disk, the disk was full of logs about the outage
#debugging#ops
the code was fine, my mental model was the bug
#debugging#ops
the backup that ran on two boxes and nobody noticed
#debugging#ops
the leak was a map i forgot to delete from
#debugging#ops
the bug that only happened when nobody was watching
#debugging#ops
it's always dns, and this time it really was
#debugging#ops
how a single full filesystem took down a perfectly healthy service
#debugging#ops
the cron job that ran twice and told nobody
#debugging#ops
it was the mtu, it's always the mtu
#debugging#ops
when the logs lie, the wire doesn't
#debugging#ops
when the app, the logs and the dashboards all lied, tcpdump didn't
#debugging#ops
the cron job that ran twice and told no one
#debugging#ops
the outage that was just a disk quietly filling up
#debugging#ops
the bug that only existed when nobody was watching
#debugging#ops
the map that ate all the memory
#debugging#ops
the off-by-one three of us missed
#debugging#ops
the loop that processed every day except the last one
#debugging#ops
the off-by-one three people signed off on
#debugging#ops
the night a logfile took the service down
#debugging#ops
the connection that hung at exactly the wrong size
#debugging#ops
the leak that was just a map nobody ever emptied
#debugging#ops
i was certain it was a race condition
#debugging#ops
the bug that only existed when nobody was watching
#debugging#ops
the leak was a map nobody ever deleted from
#debugging#ops
the off-by-one that three of us missed
#debugging#ops
it was the mtu, it's always the mtu
#debugging#ops
the connection that worked until the payload got big
#debugging#ops
small packets fine, big packets gone
#debugging#ops
the code was correct, my mental model wasn't
#debugging#ops
when in doubt, watch the wire
#debugging#ops
the outage that was just a full /var
#debugging#ops
the outage that was just /var filling up
#debugging#ops
three days for a bug that only happened when nobody was looking
#debugging#ops
it wasn't the database, it wasn't the network, it was dns again
#debugging#ops
the night /var filled and took everything with it
#debugging#ops
the off-by-one we all signed off on
#debugging#ops
tcpdump saved me again
#debugging#ops
the outage was just /var being full
#debugging#ops
the off-by-one we all read and nobody saw
#debugging#ops
when the application logs lie, the wire doesn't
#debugging#ops
the code was fine, i was wrong
#debugging#ops
the off-by-one four of us read and none of us saw
#debugging#ops
the cron job that ran twice and told nobody
#debugging#ops
three days for a bug that lived in a missing word
#debugging#ops
when the logs lie, tcpdump tells the truth
#debugging#ops
the cron job that ran twice and told no one
#debugging#ops
small packets fine, big packets gone
#debugging#ops
the bug that only happened when nobody was watching
#debugging#ops
a fencepost error nobody saw because it looked right
#debugging#ops
the off-by-one that three of us signed off on
#debugging#ops
it was dns, it is always dns
#debugging#ops
the fencepost that three of us read and none of us saw
#debugging#ops
small packets fine, big packets gone, and a tunnel in the middle
#debugging#ops
when nothing was down except the names
#debugging#ops
the bug that only existed when nobody was looking
#debugging#ops
the cron job that ran twice and told nobody
#debugging#ops
half the requests worked, which is how i knew it was the mtu
#debugging#ops
the disk wasn't full, /var was
#debugging#ops
chasing a race condition for three days
#debugging#ops
three days for a bug that only existed when I wasn't looking
#debugging#ops
three days for a missing mutex
#debugging#ops
when in doubt, put it on the wire and watch
#debugging#ops
it's always dns, and this time it really was
#debugging#ops
the disk wasn't full, only the partition that mattered
#debugging#ops
the bug that only existed when nobody was watching
#debugging#ops
when nothing made sense, the wire did
#debugging#ops
the code was fine, i wasn't
#debugging#ops
the outage caused by a full /var
#debugging#ops
when nobody believes the network, run tcpdump
#debugging#ops
three days, one race, and a log line that lied to me
#debugging#ops
the off-by-one that three of us approved
#debugging#ops
it's never dns, until it is
#debugging#ops
when in doubt, put it on the wire
#debugging#ops
the night a forgotten log file took down the lot
#debugging#ops
three days hunting a race condition that only existed under load
#debugging#ops
the day /var filled up and took everything with it
#debugging#ops
the day dns took down everything
#debugging#ops
the outage caused by a full /var
#debugging#ops
the leak was a map i forgot to ever delete from
#debugging#ops
when in doubt, look at the wire
#debugging#ops
the off-by-one three of us read and approved
#debugging#ops
three days hunting a bug that only happened when i wasn't looking
#debugging#ops
when /var fills up and everything gets weird
#debugging#ops
the outage that was just a full /var
#debugging#ops
the report that doubled and nobody noticed
#debugging#ops
the bug that only existed when nobody was looking
#debugging#ops
three days lost to a bug that only happened when i wasn't looking
#debugging#ops
the cron job that ran twice and said nothing
#debugging#ops
the leak was a map i kept adding to and never deleting from
#debugging#ops
it's never dns, until it's the only resolver in the house
#debugging#ops
the bug that fixed itself the moment i looked at it
#debugging#ops
the slow leak that was just a map nobody deleted from
#debugging#ops
a day lost to packets that almost made it
#debugging#ops
the off-by-one four of us nodded straight past
#debugging#ops
the bug that only existed when nobody was watching
#debugging#ops
the night /var filled up and took the whole box with it
#debugging#ops
the off-by-one four of us read and none of us saw
#debugging#ops
it was the mtu. it's always the mtu
#debugging#ops
the bug that only existed when nobody was watching
#debugging#ops
the outage that wasn't down, it was just lying
#debugging#ops
three days for one missing lock
#debugging#ops
the slow leak that was just a map nobody ever emptied
#debugging#ops
the outage caused by a full /var
#debugging#ops
the bug that only existed when nobody was watching
#debugging#ops
it was the mtu, it's always the mtu
#debugging#ops
the outage that was just a full /var
#debugging#ops
the backup that ran twice and corrupted itself
#debugging#ops
a vpn that throttled itself, and yes it was the mtu
#debugging#ops
three days lost to a goroutine that started too early
#debugging#ops
a map with no exit, and the eviction i should have written first
#debugging#ops
the unbounded map, and how i finally went looking for it
#debugging#ops
it wasn't the network, it was the names
#debugging#ops
the cache that grew until the box fell over
#debugging#ops
the overlapping cron job that ate its own tail
#debugging#ops
when a job fired twice because two clocks disagreed
#debugging#ops
the outage caused by a full /var
#debugging#ops
the slow leak that was a cache i forgot to evict
#debugging#ops
it was dns, it is always dns
#debugging#ops
the outage where the disk was full and nobody had noticed
#debugging#ops
the day dns took down everything, again
#debugging#ops
the outage nobody saw coming, because /var was full
#debugging#ops
the outage caused by a full /var
#debugging#ops
it was never the network, it was the resolver
#debugging#ops
the outage that was just a full /var
#debugging#ops
the night /var filled up and took the app with it
#debugging#ops
small packets fine, big packets gone: the mtu strikes again
#debugging#ops
the outage that wasn't the database, it was dns again
#debugging#ops
it was the mtu, it's always the mtu
#debugging#ops
the cron job that ran twice and never told me
#debugging#ops
three days inside a race i couldn't reproduce
#debugging#ops
it was never the app, it was always dns
#debugging#ops
two boxes, one cron line, and a backup that ran in stereo
#debugging#ops
the map that grew until the process didn't
#debugging#ops
the cron job that ran twice and told no one
#debugging#ops
the day /var quietly filled up
#debugging#ops
the job that fired twice because two boxes thought they were in charge
#debugging#ops
the cache that only ever grew
#debugging#ops
the leak was a map i kept adding to and never pruned
#debugging#ops
everything broke because /var was full
#debugging#ops
the leak was a map, and the map was me
#debugging#ops
the night /var filled up and took the lot with it
#debugging#ops
the leak was a map i forgot to empty
#debugging#ops
it was dns, it is always dns
#debugging#ops
the cron job that ran twice and told no one
#debugging#ops
the leak was a map i forgot to empty
#debugging#ops