the network that worked, except when it really mattered

The worst kind of network problem is the one where ping works. Ping works, SSH connects, you log in and poke around and everything seems fine, and yet something is deeply, intermittently broken in a way that defies every quick check you reach for. Nine times out of ten, in my experience, that something is the MTU. It's the silent killer because it doesn't fail loudly, it fails selectively, and the things it lets through are exactly the things you test with.

Here's the one that cost me an evening. I'd set up a VPN tunnel between my homelab and a small VPS, so I could reach internal services from outside. Brought it up, pinged across it, logged in over SSH, browsed a couple of internal pages. All good. Then I tried to git clone a decent-sized repo over the tunnel and it hung. Not failed. Hung, at a seemingly random percentage, every single time.

why small things work and big things don't

The clue is in what works versus what doesn't. SSH handshakes, pings, the start of an HTTP response: all small packets. A git clone or a large file transfer: big packets, full-size frames stuffed to the brim. When small traffic flows and large traffic stalls, you are almost always looking at an MTU problem somewhere in the path.

MTU, the maximum transmission unit, is the largest packet a link will carry in one piece. Standard Ethernet is 1500 bytes. The trouble with a tunnel is that it wraps your packet inside another packet, and that wrapper costs bytes. A typical VPN encapsulation eats somewhere between 40 and 100 bytes of overhead. So a full 1500-byte packet handed to the tunnel comes out at 1540 bytes or more on the wire, which the underlying 1500-byte link can't carry whole. Something has to fragment it or drop it, and since most modern TCP stacks set the don't-fragment bit, it usually gets dropped. That's where it all goes wrong.
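
The arithmetic is worth writing down once. A minimal sketch, assuming an illustrative 80 bytes of overhead (the real figure depends on the VPN protocol and on whether the outer packet is IPv4 or IPv6); inner_mtu is a made-up helper name, not a standard tool:

```shell
# inner_mtu OUTER_MTU OVERHEAD
# Largest packet the tunnel can accept without the wrapped result
# exceeding the MTU of the underlying link.
inner_mtu() {
    echo $(( $1 - $2 ))
}

# 80 bytes of overhead is an assumed, illustrative figure.
inner_mtu 1500 80    # prints 1420
```

Anything larger than that result, handed to the tunnel, no longer fits the link underneath once it is wrapped.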

why it hangs instead of failing cleanly

In a sane world the sender would learn the path can't take a big packet and send smaller ones. That's Path MTU Discovery, and it works like this: the router that can't forward the packet sends back an ICMP "fragmentation needed" message, which says "too big" and carries the MTU that would have fit. The sender shrinks its packets and life goes on.

The problem is that an enormous number of firewalls block ICMP wholesale, on the theory that ICMP is "ping" and ping is "hackers". So the "too big" message never comes back. The small packets that set up the connection got through fine, so the sender keeps cheerfully firing full-size packets into a link that silently bins them, and the transfer just... stalls. This failure mode has a name, the ICMP black hole (also called a PMTUD black hole), and it is responsible for an unreasonable share of the grey hairs in this industry. The connection isn't refused. It's accepted, and then it dies in the middle, and every instinct you have says "the server's slow" when the truth is "your packets are too fat and nobody's allowed to tell you".
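
If one of the firewalls on the path is yours, the exception it needs is narrow. A sketch, assuming an iptables firewall that drops by default; the chains and rule placement are assumptions to adapt to your own ruleset:

```shell
# Let ICMP "fragmentation needed" (type 3, code 4) through, both for
# traffic addressed to this box and for traffic routed through it.
# -I inserts at the top of the chain, ahead of any existing drop rule.
iptables -I INPUT   -p icmp --icmp-type fragmentation-needed -j ACCEPT
iptables -I FORWARD -p icmp --icmp-type fragmentation-needed -j ACCEPT
```

This covers IPv4 only. IPv6 has its own equivalent, the ICMPv6 "packet too big" message, and it matters even more there because IPv6 routers never fragment on the sender's behalf.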

how to actually find it

The test I reach for first is ping with the don't-fragment bit set and a deliberately large payload. On Linux:

# 1472 payload + 28 header = 1500 bytes, the standard MTU
ping -M do -s 1472 10.0.0.2
# now push past it and watch it break
ping -M do -s 1500 10.0.0.2

If the first gets through, the path carries a full 1500 bytes and the MTU isn't your problem. If it comes back with "message too long", "Frag needed and DF set", or just silence, the way the second one does, you're over the ceiling. Walk the payload size down until pings get through, take the largest payload that makes it, add 28 for the IP and ICMP headers (20 and 8 bytes), and that number is the real MTU of the path. For my tunnel it landed around 1420, which lined up with the encapsulation overhead I'd forgotten to account for.
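
Walking the size down by hand gets old fast, so here is a rough sketch that binary-searches it instead. It assumes Linux iputils ping, and that the low starting payload of 1200 bytes really does get through; probe and find_path_mtu are names I made up for the example:

```shell
# probe HOST PAYLOAD
# One ping with the don't-fragment bit set and a 1-second timeout.
# Succeeds only if a packet with that payload crossed the path.
probe() {
    ping -M do -c 1 -W 1 -s "$2" "$1" >/dev/null 2>&1
}

# find_path_mtu HOST [LOW] [HIGH]
# Binary-search the largest payload that gets through, then add
# 28 bytes (20 IP + 8 ICMP) to turn it into a path MTU.
find_path_mtu() {
    host=$1
    low=${2:-1200}     # payload assumed to fit
    high=${3:-1472}    # largest payload a 1500-byte link can carry
    while [ "$low" -lt "$high" ]; do
        mid=$(( (low + high + 1) / 2 ))
        if probe "$host" "$mid"; then
            low=$mid               # fits: look for something bigger
        else
            high=$(( mid - 1 ))    # too big: look lower
        fi
    done
    echo $(( low + 28 ))
}
```

Run it as find_path_mtu 10.0.0.2. It sends a single ping per step, so one lost packet on a flaky link will pull the answer low; if the result looks odd, run it again before trusting it.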

the fix, and the better fix

The direct fix is to set the tunnel interface's MTU to the value you measured, so the sender never builds a packet too big to fit:

ip link set dev tun0 mtu 1420

That works, but it relies on every host knowing to use the smaller size, which isn't guaranteed for traffic routed through the box. The more robust fix is MSS clamping, which makes the router rewrite the maximum segment size in passing TCP handshakes so both ends agree to use smaller packets from the start:

iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN \
    -j TCPMSS --clamp-mss-to-pmtu

--clamp-mss-to-pmtu derives the segment size from the MTU of the route the packet is taking, which in the usual case is the interface MTU, so you set the MTU correctly once and the clamping follows. This only helps TCP, but TCP is the stuff that hangs, so in practice it solves the visible problem.
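
For a sense of the numbers involved, the MSS is just the MTU with the IP and TCP headers taken off. A small sketch for IPv4 with no header options counted; mss_for_mtu is a made-up helper name:

```shell
# mss_for_mtu MTU
# TCP maximum segment size for an IPv4 path: the MTU minus 20 bytes
# of IP header and 20 bytes of TCP header.
mss_for_mtu() {
    echo $(( $1 - 40 ))
}

mss_for_mtu 1500    # prints 1460, the usual Ethernet value
mss_for_mtu 1420    # prints 1380, what a 1420-byte tunnel needs
```

That second number is also what you would hand to the TCPMSS target's --set-mss option if you wanted to pin the value by hand rather than let the clamp work it out. For IPv6 the headers cost 60 bytes, not 40.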

After clamping, the git clone that had hung a dozen times ran straight through to completion, first try, no hang. Same network, same tunnel, same everything, except the packets now fit. The lesson I keep relearning: when the network works for small things and dies on big ones, stop blaming the application and go measure the MTU. It's nearly always the MTU. And while you're in there, please stop blocking all ICMP at the firewall, because the message you're dropping is the one that would have told you exactly this.