I have written distributed systems for a living. I know about consistency, ordering, and the fact that the network is a liar. So when I sat down to build a small multiplayer prototype in Unreal Engine 4 over the Christmas break, I assumed the netcode would be the easy bit and the gameplay the hard bit. It was, of course, exactly the other way round, and the prototype spent three weeks teaching me humility one desync at a time.
The idea was modest: two players, a shared arena, things you can pick up and throw at each other. Nothing that should trouble an engine that ships AAA shooters. The gameplay logic came together in a weekend on a single machine. Then I clicked "2 players" in the editor and everything I thought I understood politely fell apart.
authority is not a suggestion
The first lesson Unreal teaches you, repeatedly, until it sinks in, is that there is a server and there are clients and they are not the same thing. On my single machine, every actor was the authority over itself, so any code I wrote "just worked". The moment a second client appeared, half my code was running on a machine that had no right to change anything.
In Unreal terms, the server owns the truth. A client asking for something to happen has to ask the server, via a Server RPC, and then the server decides and tells everyone via replication. I knew this in the abstract. I had read the docs. But there is a difference between reading "clients are not authoritative" and watching a thrown object teleport back to where it started because the client moved it locally and the server, who never agreed, replicated the real position back over the top.
// Wrong: client mutates state directly and hopes
void AThrowable::Throw(FVector Direction)
{
ApplyImpulse(Direction * ThrowForce);
}
// Right: client asks, server decides, everyone hears about it
UFUNCTION(Server, Reliable)
void AThrowable::ServerThrow(FVector Direction);
void AThrowable::ServerThrow_Implementation(FVector Direction)
{
// runs on the server, which has authority
ApplyImpulse(Direction * ThrowForce);
}
Once I stopped fighting the authority model and started routing every meaningful change through the server, the teleporting stopped. The objects became slightly less responsive, because now an action had to round-trip, and that was the next lesson waiting in the queue.
replication, and the things you forget to replicate
Replication in Unreal is wonderfully convenient right up until it isn't. You mark a property Replicated, add it to GetLifetimeReplicatedProps, and the engine quietly keeps it in sync. For position and rotation on a replicated actor, you get a lot for free.
What you do not get for free is everything else, and the bugs come from the things you forgot. I had a "held object" pointer that was perfectly correct on the server and stubbornly null on the clients, because I had never told it to replicate. The visual result was a player holding nothing while clearly carrying something. I spent an embarrassing evening on that one, convinced it was a physics problem, when it was simply a variable that existed in only one of the two worlds I was now running.
The mental shift that helped: stop thinking "my game state" and start thinking "the server's game state, and a copy on each client that is only as correct as I have bothered to make it". Every piece of state has to earn its replication, and anything cosmetic that I could derive on the client, I did, to keep the wire quiet.
The other thing that bit me was order. Replication is not transactional; properties arrive when they arrive, and a client can briefly see a new "held object" pointer before the object it points at has itself replicated in. For one or two frames you have a player holding a reference to a thing that does not yet exist on that machine, which dereferences to null and, if you are careless, crashes. Unreal gives you RepNotify callbacks for exactly this, so you react to a value changing rather than assuming the whole world updated at once. I had to learn to write client code that copes with arriving halfway through someone else's truth, which is a sentence that would have meant nothing to me a month ago.
latency makes a liar of your eyes
The cruellest lesson was last. Everything worked beautifully when both editor instances ran on the same machine with zero latency. So I added artificial network conditions, because the engine lets you, and dialled in a hundred milliseconds of lag.
Net PktLag=100
Suddenly the prototype felt like wading through treacle. Every throw, every pickup, every action waited for the round trip before anything happened on screen. It was correct and it was horrible. This is where real games earn their keep: client-side prediction, where the client does the thing immediately and quietly corrects if the server disagrees. Implementing even a crude version of that for movement was more work than the entire rest of the prototype, and I came away with enormous respect for the people who do it well enough that you never notice.
what it taught me
I went in arrogant. I came out with a much shorter list of things I'm sure about. The network is not a feature you add at the end; it is the shape of the whole thing, and authority, replication and latency are not edge cases, they are the case. My day-job instincts about distributed systems helped more than I expected (treat the client as untrusted, make the server the source of truth) but the real-time, sixty-frames-a-second, "and it must also feel good" constraint is a different beast entirely.
The prototype is still ugly and the throwing still feels slightly off under load. But it works with two players on two real machines across a real network, and three weeks ago I genuinely thought that would be the easy part. I am glad to have been wrong, and I will not be making a multiplayer game any time soon.