Ramblings of an aging IT geek
networking

it's never dns, except when it's mtu

How a mismatched MTU on a homelab VPN produced a fault that let small packets through and silently swallowed large ones, and how to find it.

The worst network faults are the ones where the network mostly works. Ping is fine. DNS resolves. SSH connects and you can type away happily. Then you try to git clone something large, or load a page with a fat response, or copy a file over the VPN, and it hangs forever with no error. That's the signature of an MTU problem, and it cost me an evening before the penny dropped.

Here's the short version of what MTU is and why it bites. Every link has a Maximum Transmission Unit, the largest packet it'll carry in one piece. Ethernet is classically 1500 bytes. The moment you wrap traffic in something (a VPN, a tunnel, PPPoE on a domestic line, an overlay network) you add headers, and those headers eat into the payload. If the inner stack still thinks it can send 1500-byte packets but the tunnel can only carry, say, 1420 bytes of payload, something has to give. Normally that "something" is fragmentation or a polite ICMP message telling the sender to use smaller packets. When that mechanism breaks, large packets simply vanish, and nothing tells you.
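The arithmetic is worth seeing once. This is a rough sketch, not anything official: tunnel_mtu is a made-up helper name, and the header sizes are the usual published ones (20 bytes for IPv4, 40 for IPv6, 8 for UDP, 32 for WireGuard's data header plus authentication tag, 8 for PPPoE). Your own stack of wrappers may differ, so treat the numbers as a starting point to check against a real measurement.

```sh
# tunnel_mtu LINK_MTU OVERHEAD...
# Subtract each encapsulation header from the link MTU and print what is left.
# (hypothetical helper, named here for illustration only)
tunnel_mtu() {
    mtu=$1
    shift
    for overhead in "$@"; do
        mtu=$((mtu - overhead))
    done
    echo "$mtu"
}

# WireGuard over an IPv4 underlay: outer IPv4 (20) + UDP (8) + WireGuard (32)
tunnel_mtu 1500 20 8 32    # prints 1440
# WireGuard over an IPv6 underlay: outer IPv6 (40) + UDP (8) + WireGuard (32)
tunnel_mtu 1500 40 8 32    # prints 1420
# PPPoE on a domestic line: 8 bytes of PPPoE/PPP header
tunnel_mtu 1500 8          # prints 1492
```

Stack a tunnel on top of PPPoE and you pass both sets of overheads to the same call, which is how the usable size ends up lower than you'd guess.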

why it presents as "small things work, big things hang"

This is the tell, and it's worth understanding because it points straight at the cause. A TCP connection opens with small packets: the handshake, the request line, a few headers. Those fit under any reasonable MTU, so the connection establishes and feels healthy. Then the server starts sending the actual payload in full-size segments. Those are too big for the constrained link. If Path MTU Discovery were working, the sender would get an ICMP "fragmentation needed" message and back off to a smaller size. But ICMP is the first thing a nervous firewall admin blocks, and the instant those messages are dropped, you have a PMTU black hole: the big packets die, no feedback comes back, and the sender keeps retransmitting the same oversized segments into the void.

So: handshake works, headers work, the bulk transfer stalls. Every single time I've seen that exact pattern, it's been MTU.

finding it

ping is the tool, with two flags you may not use often: a size, and "don't fragment". On Linux:

# -M do = set the Don't Fragment bit, -s = ICMP payload size in bytes
# 1472 payload + 8 ICMP header + 20 IPv4 header = 1500
ping -M do -s 1472 10.0.0.1

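If that 1500-byte packet gets through, your path supports a 1500 MTU. If it doesn't, how it fails is informative. An immediate "message too long" error means your own machine already knows about a smaller MTU, either on the local interface or from an earlier ICMP reply. A silent timeout means the packet was dropped somewhere along the path and nothing came back to say so, which is the black hole itself. Either way, start bisecting downwards. Knock the size back until it succeeds: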

ping -M do -s 1392 10.0.0.1   # 1392 + 28 = 1420

When you find the largest size that succeeds, add 28 and that's your real path MTU. In my case the answer over the WireGuard tunnel was 1420, because WireGuard's encapsulation overhead trims the usable payload, and I'd left every interface at the default 1500. The tunnel happily carried small packets and silently dropped the large ones the moment a real transfer started.
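Bisecting by hand gets old quickly, so here is a minimal sketch of the same search as a script. It assumes Linux iputils ping (the same -M do and -s flags as above) and an IPv4 target. The function names probe and find_path_mtu are my own, and the lower bound of 548 is just a payload I'd expect to pass anywhere. A single lost ping will look like a failure, so on a lossy link run it more than once before believing the answer.

```sh
# probe SIZE HOST
# Succeeds if one unfragmentable ping carrying SIZE bytes of payload gets a reply.
# Kept separate so it can be swapped out when testing the search offline.
probe() {
    ping -M do -c 1 -W 1 -s "$1" "$2" >/dev/null 2>&1
}

# find_path_mtu HOST [LOW] [HIGH]
# Binary-search the largest payload that passes, then add 28 bytes
# (20 IPv4 + 8 ICMP) to turn the payload size into a path MTU.
find_path_mtu() {
    host=$1
    lo=${2:-548}     # payload of a 576-byte packet; assumed to get through
    hi=${3:-1472}    # payload of a full 1500-byte packet
    if ! probe "$lo" "$host"; then
        echo "a ${lo}-byte payload already fails; not just an MTU problem" >&2
        return 1
    fi
    # Invariant: lo is known to pass; nothing above hi is worth trying.
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi + 1) / 2 ))
        if probe "$mid" "$host"; then
            lo=$mid
        else
            hi=$((mid - 1))
        fi
    done
    echo $((lo + 28))
}

# Example (touches the network, so left commented out):
# find_path_mtu 10.0.0.1
```

It needs about ten probes to cover the range from 548 to 1472, which beats guessing.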

fixing it

There are two honest fixes and one bodge.

The clean fix is to set the MTU correctly on the tunnel interface so the stack knows the true limit and sizes its packets accordingly:

ip link set dev wg0 mtu 1420

Set it on both ends, persist it in your interface config, and the problem is gone properly. The OS now generates appropriately sized packets and never offers the link something it can't carry.
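Since "persist it" is the step that gets forgotten, a small check after a reboot is cheap insurance. This sketch reads the value the kernel exposes under /sys/class/net. The names iface_mtu and check_mtu, and the SYSFS_NET override, are hypothetical conveniences of mine, not part of any tool.

```sh
# iface_mtu IFACE
# Print the MTU the kernel currently holds for IFACE.
# SYSFS_NET is a hypothetical override so the function can be pointed
# at a scratch directory instead of the real /sys/class/net.
iface_mtu() {
    cat "${SYSFS_NET:-/sys/class/net}/$1/mtu"
}

# check_mtu IFACE EXPECTED
# Succeeds only if the interface MTU matches; returns 2 if it can't be read.
check_mtu() {
    actual=$(iface_mtu "$1") || return 2
    if [ "$actual" -eq "$2" ]; then
        echo "$1: mtu $actual ok"
    else
        echo "$1: mtu $actual, expected $2" >&2
        return 1
    fi
}

# Example (needs a real wg0, so left commented out):
# check_mtu wg0 1420
```

If the tunnel is managed by wg-quick, the persistent setting is an MTU = 1420 line in the [Interface] section of the config. Other network managers have their own place for it.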

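The second fix, useful when you can't control every host behind the tunnel, is MSS clamping. You tell the router to rewrite the TCP Maximum Segment Size option on SYN packets as they pass, so each end is told to send segments small enough for the tunnel. With --clamp-mss-to-pmtu the router works the value out from the path MTU it knows about, which in practice means the tunnel interface on the router itself still needs the right MTU: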

iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN \
  -j TCPMSS --clamp-mss-to-pmtu

That --clamp-mss-to-pmtu is the pragmatic homelab favourite because it works without touching the clients. It only helps TCP, mind. UDP gets no such courtesy, which is why a VoIP or game stream can still misbehave even after you've clamped TCP into shape.
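If you'd rather pin the value than let the router work it out, the sum is simple: the MSS is the MTU minus the IP header and the 20-byte TCP header. A rough sketch follows, with mss_for_mtu and clamp_rule as made-up helper names. The rule is only printed, not applied, and it uses the TCPMSS target's --set-mss option in place of --clamp-mss-to-pmtu.

```sh
# mss_for_mtu MTU [IP_HEADER_BYTES]
# MSS = MTU - IP header (20 for IPv4 by default, pass 40 for IPv6) - 20 TCP.
mss_for_mtu() {
    echo $(( $1 - ${2:-20} - 20 ))
}

# clamp_rule MTU
# Print, without running it, an IPv4 rule that pins the MSS to fit MTU.
clamp_rule() {
    echo "iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss $(mss_for_mtu "$1")"
}

mss_for_mtu 1420       # prints 1380 (IPv4)
mss_for_mtu 1420 40    # prints 1360 (IPv6)
clamp_rule 1420        # prints the rule with --set-mss 1380
```

A fixed value is easier to reason about when you're debugging, but it goes stale the day the tunnel's overhead changes, which is the argument for letting --clamp-mss-to-pmtu track it.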

The bodge, which you'll see suggested everywhere, is to just lower the MTU on the LAN-facing interface to something small like 1400 and walk away. It works, sort of, at the cost of slightly worse efficiency on every packet for the sake of the occasional tunnelled one. I'd rather fix the actual link.

the lesson I keep relearning

Allow ICMP, specifically type 3 code 4, "fragmentation needed" (on IPv6 the equivalent is ICMPv6 type 2, "packet too big"). The reflex to block all ICMP because "ping is a security risk" is how you break Path MTU Discovery for your whole network and turn a self-correcting mechanism into a silent black hole. That feedback message is load-bearing. Drop it and you've signed up for exactly the evening I just had.
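For an iptables firewall, letting that one message through looks roughly like this. It's a sketch under assumptions: pmtud_rules is a name I've invented, the chains default to INPUT and FORWARD, and the rules are printed for review, not applied, because where they belong depends on the rest of your ruleset. iptables knows type 3 code 4 by the name fragmentation-needed.

```sh
# pmtud_rules [CHAIN...]
# Print iptables rules that accept ICMP type 3 code 4 ("fragmentation needed").
# -I CHAIN 1 inserts at the top of the chain so the rule is reached before
# any blanket DROP further down. Defaults to INPUT and FORWARD.
pmtud_rules() {
    [ "$#" -gt 0 ] || set -- INPUT FORWARD
    for chain in "$@"; do
        echo "iptables -I $chain 1 -p icmp --icmp-type fragmentation-needed -j ACCEPT"
    done
}

pmtud_rules
```

Read the output, and if it suits your ruleset, paste the lines in as root. You can still drop echo requests if pings bother you. This rule only admits the message Path MTU Discovery depends on.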

It's never DNS. Until it is. And when it isn't DNS and the small stuff works but the big stuff hangs, it's MTU, and now you know which size of ping to reach for.