Ramblings of an aging IT geek
gamedev

the first time unreal's replication made sense to me

Notes from my first real fight with Unreal's actor replication, the authority model, and why the client kept lying about where things were.

I have written a fair amount of networked code over the years, mostly the unglamorous server kind where you own both ends and a packet is a packet. Unreal's replication is a different animal, and my first proper encounter with it left me staring at a door that opened on my machine and stayed firmly shut on everyone else's. This is a write-up of how I climbed out of that hole, mostly so future me has something to read when I inevitably fall back in.

the mental model I was missing

The thing that tripped me up is that there is no single shared world. There is the server's world, which is the truth, and there is each client's local copy, which is a best-effort approximation that the server is constantly correcting. When I clicked the door, my client happily ran the "open" code locally and the door swung. Nobody else's did, because nothing told the server anything had happened. The client had simply lied to its owner and to no one else.

Authority is the word that unlocks this. Every actor has a notion of who is in charge of it, and for almost everything that matters, that is the server. The client is allowed to predict and to ask, but it is not allowed to decide. Once I stopped thinking "this code runs" and started thinking "this code runs on the authority, and then the result is told to everyone", the whole thing stopped being mysterious.
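A toy version of that idea in plain C++ helped me pin it down. Nothing here is an Unreal type: ToyServer, ToyClient and DoorState are names invented for this sketch, and "replication" is just a loop that copies the server's value over every client's copy.

```cpp
#include <cassert>
#include <vector>

// Toy model only: none of these types are Unreal classes.
struct DoorState {
    bool isOpen = false;
};

struct ToyClient {
    DoorState localCopy;   // best-effort approximation of the server's world

    // What my first attempt did: change the local copy and tell nobody.
    void toggleLocally() { localCopy.isOpen = !localCopy.isOpen; }
};

struct ToyServer {
    DoorState truth;       // the authoritative state

    // The authority decides, then the result is pushed to every client.
    void toggleAndReplicate(std::vector<ToyClient>& clients) {
        truth.isOpen = !truth.isOpen;
        for (ToyClient& c : clients) {
            c.localCopy = truth;
        }
    }
};

// Count how many clients currently see the door open.
int countOpen(const std::vector<ToyClient>& clients) {
    int n = 0;
    for (const ToyClient& c : clients) {
        if (c.localCopy.isOpen) ++n;
    }
    return n;
}
```

With three clients, calling toggleLocally on one of them leaves exactly one open door and a server that still believes it is shut. Calling toggleAndReplicate on the server opens it for all three.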

replicated state versus events

There are two flavours of "make this happen over the network" and conflating them cost me an evening.

The first is replicated state: a variable whose value the server keeps in sync on every client. You mark it Replicated (or ReplicatedUsing, if you want a callback when it changes), register it in GetLifetimeReplicatedProps, and whenever the server changes it the new value arrives on the clients. This is for things that have a value at all times, like health, or whether the door is open.

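// Door.h (inside the ADoor class declaration)
UPROPERTY(ReplicatedUsing = OnRep_IsOpen)
bool bIsOpen = false;

UFUNCTION()   // the OnRep handler has to be a UFUNCTION
void OnRep_IsOpen();

// Door.cpp
#include "Net/UnrealNetwork.h"   // for DOREPLIFETIME

void ADoor::GetLifetimeReplicatedProps(TArray<FLifetimeProperty>& Out) const
{
    Super::GetLifetimeReplicatedProps(Out);
    DOREPLIFETIME(ADoor, bIsOpen);   // register bIsOpen for replication
}

void ADoor::OnRep_IsOpen()
{
    // Runs on clients when the replicated bIsOpen changes. Play the animation, etc.
    PlayDoorAnimation(bIsOpen);
}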

The important subtlety: in C++, OnRep_IsOpen runs on the clients, not on the server. (Blueprint RepNotify functions, as I understand it, do also fire on the server, which only adds to the confusion.) The server changed the value, so it already knows; it has to call the cosmetic part itself if it wants the animation locally. I lost a good twenty minutes to a door that animated for everyone except the player hosting a listen server, which is exactly this gap.
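The listen-server gap is easy to reproduce in a toy. This is a minimal sketch in plain C++, not engine code: ToyDoor, onRepIsOpen and serverSetOpen are hypothetical stand-ins, and the serverCallsNotifyItself flag exists only to show the difference between forgetting and remembering to run the cosmetic part on the host.

```cpp
#include <cassert>
#include <vector>

// Toy model only: a hand-rolled stand-in for a ReplicatedUsing property.
struct ToyDoor {
    bool isOpen = false;
    int animationsPlayed = 0;

    // Stand-in for OnRep_IsOpen: the cosmetic reaction to a new value.
    void onRepIsOpen() { ++animationsPlayed; }
};

// The server sets the value on its own copy, then "replicates" it.
// Only the client copies get the notify automatically.
void serverSetOpen(ToyDoor& serverCopy, std::vector<ToyDoor>& clientCopies,
                   bool newValue, bool serverCallsNotifyItself) {
    if (serverCopy.isOpen == newValue) return;  // nothing changed, nothing to send
    serverCopy.isOpen = newValue;
    if (serverCallsNotifyItself) serverCopy.onRepIsOpen();
    for (ToyDoor& c : clientCopies) {
        c.isOpen = newValue;
        c.onRepIsOpen();
    }
}
```

With serverCallsNotifyItself set to false, the host's door is open but its animationsPlayed stays at zero while every client's goes up by one. That is the door that animated for everyone except the host.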

The second flavour is an event, a one-shot thing with no persistent value, like a gunshot or a footstep sound. That is what RPCs are for, and they come in three directions.

the three RPCs, and which way they point

  • A Server RPC runs on the server when called from a client. This is how the client asks for something. My door click should have been a server RPC: "I would like to open this door, please."
  • A Client RPC runs on a specific owning client when called from the server. Useful for telling one player something private.
  • A NetMulticast RPC runs on the server and on all clients when called from the server. Good for cosmetic effects everyone should see.

The pattern I should have written from the start looks like this:

// Door.h (inside the ADoor class declaration)
UFUNCTION(Server, Reliable)
void ServerRequestToggle();

// Door.cpp
void ADoor::ServerRequestToggle_Implementation()
{
    // A Server RPC body already executes on the server; the guard is a sanity check.
    if (HasAuthority())
    {
        bIsOpen = !bIsOpen;       // replicates to clients, whose OnRep then fires
        OnRep_IsOpen();           // the server gets no OnRep, so run the cosmetic part here too
    }
}

The client presses a key, calls ServerRequestToggle, and the rest happens on the authority. The state change replicates out, every client's OnRep fires, and the door opens for everyone, including, finally, the person who pressed the key.

reliable, unreliable, and not flooding the wire

Reliable RPCs are guaranteed to arrive, and to arrive in order, which sounds like what you always want until you remember that the guarantee costs bandwidth and the reliable buffer is finite. As far as I can tell, overflowing that buffer gets the connection closed rather than merely slowed down. Requests that change state and gameplay-critical events get Reliable. High-frequency cosmetic noise, the sort of thing you send many times a second and would not notice losing one of, gets Unreliable. I had everything reliable to begin with, which is the networking equivalent of sending every email as a recorded delivery letter.
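Why the reliable buffer fills up is easier to see in a toy than in prose. The sketch below is plain C++ and entirely hypothetical: ToyReliableChannel and ToyUnreliableChannel are invented names, the capacity is a made-up number rather than the engine's, and a real channel does far more than this.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Toy model only. The capacity is a made-up number, not the engine's.
class ToyReliableChannel {
public:
    explicit ToyReliableChannel(std::size_t capacity) : capacity_(capacity) {}

    // A reliable message must be kept until it is acknowledged,
    // so it occupies a slot. Returns false when the buffer is full.
    bool send(const std::string& msg) {
        if (pending_.size() >= capacity_) return false;
        pending_.push_back(msg);
        return true;
    }

    // Acknowledge the oldest pending message, freeing its slot (in order).
    void ackOldest() {
        if (!pending_.empty()) pending_.pop_front();
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::size_t capacity_;
    std::deque<std::string> pending_;
};

// An unreliable send keeps nothing: it goes out once and is forgotten.
class ToyUnreliableChannel {
public:
    void send(const std::string&) { ++sent_; }
    std::size_t sentCount() const { return sent_; }

private:
    std::size_t sent_ = 0;
};
```

A reliable channel with two slots accepts two sends and refuses the third until something is acknowledged. The unreliable one takes a hundred footsteps without holding on to any of them, which is the trade being made.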

what actually caught me out

A few things that the documentation states plainly and that I ignored until they bit me:

  • An actor only replicates if bReplicates is set. Obvious in hindsight, an hour of confusion in practice.
  • HasAuthority() is your constant sanity check. Half of replication bugs are code running on the wrong machine, and a guard at the top of the function would have told me so immediately.
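  • The owning connection matters for Server RPCs. If the actor calling the RPC is not owned by the client's connection, the call is quietly dropped on the client and never reaches the server. My door was owned by nobody, so its requests went nowhere and nothing on screen told me so. The usual fix is to send the request through an actor the client does own, such as its pawn or player controller, and let the server act on the door from there.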
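Direction and ownership together decide where an RPC actually executes, and I found it useful to write the rules down as a lookup. This is my reading of the RPC table in the Unreal networking documentation, squeezed into plain C++. The enums and the whereItRuns function are hypothetical names for this sketch, not engine types.

```cpp
#include <cassert>

// Toy lookup only: these enums are hypothetical, not engine types.
enum class Rpc { Server, Client, NetMulticast };
enum class Caller { Server, Client };
// For calls made by the server, CallingClient and OtherClient both
// simply mean "owned by some client".
enum class Owner { CallingClient, OtherClient, Server, Nobody };
enum class RunsOn { Server, OwningClient, ServerAndAllClients, InvokingClient, Dropped };

RunsOn whereItRuns(Rpc rpc, Caller caller, Owner owner) {
    if (caller == Caller::Server) {
        switch (rpc) {
            case Rpc::Server:       return RunsOn::Server;
            case Rpc::NetMulticast: return RunsOn::ServerAndAllClients;
            case Rpc::Client:
                // Only a client-owned actor has an owning client to send to.
                return (owner == Owner::CallingClient || owner == Owner::OtherClient)
                           ? RunsOn::OwningClient
                           : RunsOn::Server;
        }
    } else {
        switch (rpc) {
            case Rpc::Server:
                // The case that bit me: no ownership, no delivery.
                return owner == Owner::CallingClient ? RunsOn::Server : RunsOn::Dropped;
            case Rpc::Client:
            case Rpc::NetMulticast:
                // Called from a client, these just run locally.
                return RunsOn::InvokingClient;
        }
    }
    return RunsOn::Dropped;  // not reached; keeps compilers happy
}
```

Asking it about a Server RPC called from a client on an actor owned by Nobody gives Dropped, which is my door in one line.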

I am under no illusion that I have mastered any of this. Prediction, smoothing, and dealing with the inevitable disagreement between what the client guessed and what the server decided are all still ahead of me, and I gather that is where the genuinely hard problems live. But the door opens for everyone now, the listen server and the dedicated server behave the same way, and I finally have a mental model that survives contact with the next bug. That is enough of a foothold to keep going.