Deterministic Rollback/Lockstep

Hi,

I’d like to ask about the rollback/lockstep approach to netcode. I know it is a complex solution, and it seems that example projects and learning material are scarce.

There are third-party products, such as Quantum, but I honestly don’t see them as viable. If someone knows of another third-party product, could you let me know?

I’m not concerned with determinism, but with the logic/algorithm of rollback/lockstep and with frame synchronization.

Do you know of any open source projects? I found some and have already run most of them, but each one seems to have some flaw.

Some projects:
https://github.com/provencher/Backroll
https://github.com/proepkes/UnityLockstep
GitHub - chromealex/ecs: ECS for Unity with full game state automatic rollbacks (not tested)
GitHub - proepkes/UnityLockstep: Deterministic Lockstep with clientside prediction and rollback
GitHub - mrdav30/LockstepRTSEngine: (WIP) Deterministic, Lockstep RTS Engine
GitHub - JiepengTan/Lockstep-Tutorial: 帧同步 教程 (frame synchronization tutorial) (I like this one, but unfortunately it does not work very well; for example, it jitters even when run locally)

There are some articles on implementation, but in my opinion they are too abstract: they explain a vague concept, and when it comes to actually implementing it I draw a complete blank.

Anyway, I’m not looking for a complete solution where everything magically works; I would like to discuss and learn.

So far, Photon TrueSync has been the best. Easiest to work with, and easiest to implement.

Generally, you just have to replace everything that needs to be deterministic with the Fixed Point library they provide, and use TrueSync’s MonoBehaviour. Not sure why that product was discontinued.
The alternative “Quantum” is inaccessible due to the pricing.
Other than that, I have not found any implementation that is as easy and flawless to work with.

DOTS will be deterministic in the future. I see that as a great fit for lockstep frameworks. Probably the best you can do is wait for DOTS.

DOTS will never be deterministic. It’s impossible because it is still using floating point.

https://docs.unity3d.com/Packages/com.unity.burst@1.4/api/Unity.Burst.FloatMode.html

There is very little information about actual implementations of this available online; the closest you will get is the GGPO source on GitHub.

Also, just curious: why don’t you consider Quantum “viable”?

Determinism is not my big problem, because I’m more interested in the rollback/lockstep part. Determinism is just a consequence, needed to ensure that all clients run identically.

I found some TrueSync projects in a few repositories. It is not a right or pretty thing to do, given that it is a paid product, but I ended up liking the project, so I decompiled and tested it. To this day I am still studying it and experimenting with decoupling it from PUN, since PUN is used only as a relay server.

TrueSync has some problems, some of them because the project was never finished:

  1. With more than 6 to 8 players, exceptions happen.
  2. 3D physics is not viable, but I am only interested in 2D physics.
  3. TrueSync is tightly coupled to Unity3D, so creating an authoritative server would be a real headache. Yeah, the Quantum price is not viable, and I really liked TrueSync for still using MonoBehaviour.

As for Quantum, there are some considerations, such as:

  1. The price is too high.
  2. I’m very interested in learning about implementation, rather than becoming a “hostage”.
  3. Besides the price being high, I don’t like a project where I don’t have access to the source code.

Thanks for answering me guys!

Okay, I was just interested (I’m the inventor/lead dev of Quantum). Best of luck to you.

This is a pipe dream; Unity will never get that to work.

Determinism is not a “consequence”, it’s the most important requirement for lockstep/rollback.

You have to code your game in such a way that any state can be reconstructed from a previous state snapshot and a sequence of inputs, identically on all peers. Only then can you reliably do rollback.

The overall idea is:

  • Make your entire simulation deterministic. The same sequence of inputs must always produce the same results.

  • Make your simulation use a fixed time step (no deltaTime). For smooth animation you’ll either have to interpolate (in case your simulation rate is lower than the display refresh rate) or run at a locked framerate with frame skipping for clients with weak GPUs.

  • Make it possible to save and restore snapshots of your simulation using as little CPU time as possible. This can get tricky when dealing with spawning and destroying stuff, and with things like sounds and particles.

  • Optimize your simulation code so you can simulate your entire rollback window in a single frame. Failure to do so can cause the game to enter a “spiral of death” where it becomes impossible to catch up after a slowdown. This means there will be a hard minimum CPU spec requirement: a client with a slow CPU will cause the entire game to run slower for all other connected players.

  • Use a custom protocol on top of UDP to send input data between peers. The goal is to synchronize the sequence of inputs between all peers as quickly as possible. Instead of sending a single input frame per packet, send multiple frames’ worth of input: this way you can nearly eliminate the need to resend lost and out-of-order packets.

  • It goes without saying, but you need to be able to run your simulation on demand: when you have enough input data, and when you receive input data and need to perform rollbacks. So forget about doing simulation work in standard Unity messages like Update/FixedUpdate. What you’ll be doing is calling a custom method from some sort of simulation manager that goes through all objects and calls their own simulate methods.

  • Kiss goodbye to Unity physics. At most you’ll be able to use overlap checks, but collision and trigger events are not reliable to run in the same order (or at all) during rollbacks. Even overlaps need to be used with care when dealing with multiple colliders.
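
To make the list above a bit more concrete, here is a minimal sketch of the snapshot / predict / rollback / re-simulate loop. It is written in Python rather than C# purely for brevity, and every name in it (`State`, `step`, `RollbackSession`) is made up for illustration; it is not taken from GGPO, TrueSync, or Quantum. It assumes integer-only state, a prediction rule of "repeat the player's last known input", and it never evicts old snapshots, which a real implementation would do once frames leave the rollback window.

```python
import copy
from dataclasses import dataclass


@dataclass
class State:
    frame: int
    positions: list  # one integer position per player


def step(state, inputs):
    """Advance exactly one fixed tick. Integer math only, so it is deterministic."""
    for player, move in enumerate(inputs):
        state.positions[player] += move
    state.frame += 1


class RollbackSession:
    def __init__(self, local_player, num_players=2):
        self.local = local_player
        self.num_players = num_players
        self.state = State(frame=0, positions=[0] * num_players)
        self.snapshots = {}   # frame -> copy of the state at the start of that frame
        self.confirmed = {}   # (frame, player) -> input actually received

    def _input(self, frame, player):
        # Use the real input if we have it; otherwise predict by repeating
        # the most recent confirmed input of that player (0 if none yet).
        for f in range(frame, -1, -1):
            if (f, player) in self.confirmed:
                return self.confirmed[(f, player)]
        return 0

    def _simulate_one(self):
        frame = self.state.frame
        self.snapshots[frame] = copy.deepcopy(self.state)
        step(self.state, [self._input(frame, p) for p in range(self.num_players)])

    def advance(self, local_input):
        """Called once per tick with the local player's input."""
        self.confirmed[(self.state.frame, self.local)] = local_input
        self._simulate_one()

    def receive(self, frame, player, value):
        """Called when a remote input arrives, possibly for a past frame."""
        self.confirmed[(frame, player)] = value
        if frame >= self.state.frame:
            return  # that frame has not been simulated yet, nothing to fix
        present = self.state.frame
        self.state = copy.deepcopy(self.snapshots[frame])  # roll back
        while self.state.frame < present:                  # re-simulate
            self._simulate_one()
```

Two peers that feed each other the same inputs end up in the same state regardless of when those inputs arrive. An obvious refinement, left out here, is to skip the rollback when the received input matches what was predicted.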

This is an excellent talk on the subject by NetherRealm Studios:

There’s another great one by Ubisoft Montreal about how they did it in For Honor, but the video itself requires a GDC Vault subscription (the slides are free):

https://www.gdcvault.com/search.php#&conference_id=208&category=free&firstfocus=&keyword=deterministic

@fholm
You are awesome! I had no idea that you worked on Quantum, let alone that you are its inventor; I believed that you had worked only on Bolt.

@Neto_Kokku
Thank you very much for answering me.

My main idea for the type of game is something like an online arena, for example Brawl Stars. So I believe that physics will not make my life too hard, since there are deterministic 2D physics engines available, or I could work with a grid-based approach too.

About UDP: would you recommend ENet as the protocol library to use? Another question: sending input data directly between players gives a huge speed gain, but is that possible with an authoritative server? TrueSync, for example, uses relay servers; it does not send packets directly between players.

I confess that I was a little lost: how can there be a simulation manager without using Update/FixedUpdate?

Everything will become more complex, even a player reconnecting to the match. xD

Since you need to be able to run the simulation multiple times when performing a rollback (and there will be rollbacks pretty much every frame), you just can’t have your simulation objects updating themselves willy-nilly on Update/FixedUpdate. You need a mechanism for telling everyone to advance a simulation step manually, so having everything that is part of the simulation be managed by a central manager-like class is advisable. The manager itself uses Update or FixedUpdate, but it decides if and how many times everything else ticks.
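
A rough sketch of that manager idea, in Python for brevity: the engine calls `update()` once per rendered frame (standing in for Unity's `Update`), and the manager alone decides how many fixed ticks to run. The class and method names, and the `max_ticks_per_update` cap, are assumptions made for this example and not part of any Unity API.

```python
class SimulationManager:
    def __init__(self, tick_seconds, max_ticks_per_update=8):
        self.tick_seconds = tick_seconds
        self.max_ticks_per_update = max_ticks_per_update  # guards against the spiral of death
        self.accumulator = 0.0   # wall-clock time not yet consumed by ticks
        self.tick = 0
        self.objects = []        # ticked in registration order, identical on every peer

    def register(self, obj):
        self.objects.append(obj)

    def simulate_tick(self):
        """One fixed simulation step. Rollback code can call this directly too."""
        for obj in self.objects:
            obj.simulate(self.tick)
        self.tick += 1

    def update(self, frame_delta_seconds):
        """Called once per rendered frame; returns how many ticks were run."""
        self.accumulator += frame_delta_seconds
        ticks_run = 0
        while (self.accumulator >= self.tick_seconds
               and ticks_run < self.max_ticks_per_update):
            self.simulate_tick()
            self.accumulator -= self.tick_seconds
            ticks_run += 1
        return ticks_run
```

The accumulator uses floating point, which is fine here because it only decides when to tick; nothing inside `simulate()` ever sees the frame delta.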

Reconnecting and joining ongoing matches also requires extra work. Depending on the length of your matches and how many steps you can afford to simulate per second, replaying the match all the way from the start may not be viable. In this case you’ll need to send a simulation snapshot to the connecting client so they have fewer frames to catch up with. That’s how For Honor does it. If you’re going to use dedicated servers this is easier to do since you won’t be at the mercy of the players’ upload speeds.

The role of a dedicated server in a lockstep game is also a bit different, since the clients are mostly only sending and receiving inputs. The dedicated server itself doesn’t need to roll back, since it only runs the simulation to keep an authoritative state of the match to thwart cheating. It can also help against the inevitable desync errors that you did not catch during development: if it detects a checksum mismatch in a client’s frame, it can push a snapshot to that client in the hope that it can get back into sync.

And unless you’re making a board game like chess or checkers, you will have desyncs. Stop thinking you don’t need to worry about them. A shooting game like Brawl Stars is definitely going to have desyncs during development, especially if this is your first lockstep rodeo. Something as simple as two objects updating in a different order on each client can easily escalate into something major like a player dying on one client but being alive on the other.

You will have to code tools to detect and debug desyncs, or your sanity will be at risk. First order is to calculate a checksum of your snapshots from “complete” simulation frames (the ones where you had all players’ inputs available) and send them to the server for comparison. The server can then compare them against its own checksums to see if the simulations match. If there’s a mismatch, you can have the client send the bad frame snapshot to the server so it can be stored for analysis and you have a chance at figuring out what happened.
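
As a small illustration of the checksum comparison described above, the sketch below serializes a snapshot into a canonical byte string (fixed field order, entities sorted by id) and hashes it with CRC32 from Python's standard library. The snapshot layout `(entity_id, x, y, hp)` and both function names are invented for this example; a real game would hash whatever its own snapshot contains.

```python
import struct
import zlib


def snapshot_checksum(frame, entities):
    """entities: iterable of (entity_id, x, y, hp) tuples of integers."""
    payload = struct.pack("<I", frame)
    # Sort so that container iteration order cannot change the checksum.
    for entity_id, x, y, hp in sorted(entities):
        payload += struct.pack("<Iiii", entity_id, x, y, hp)
    return zlib.crc32(payload)


def find_desynced_clients(server_checksum, client_checksums):
    """client_checksums: dict of client id -> checksum reported for the same frame."""
    return sorted(client for client, checksum in client_checksums.items()
                  if checksum != server_checksum)
```

Only frames for which all players' inputs were available should be checksummed, since predicted frames are expected to differ between peers.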

Some really good answers in here. To elaborate on floating point determinism and PhysX: PhysX itself actually guarantees determinism for same machine, same build, different runs. In practice, I have found this tends to extend to different machines with the same architecture (i.e., two Intel machines will be able to run it deterministically). But that’s the end of the line for PhysX (for AMD and Intel likely due to SIMD differences). So if you want players with AMD to play with Intel players (I can’t imagine why you wouldn’t), you cannot use PhysX in your game.

Next up is Unity. Unity does a bundle of stuff in its object lifecycle that is non-deterministic, like destroying objects (it’s done during the Update loop). Unity is a black box, so you cannot edit its source to make it deterministic, and must rely on workarounds.

So you write your own object lifecycle and are looking to pick a new physics engine, which is likely going to use floats (Quantum does have its own Fixed Point physics engine. Fixed Point comes with its own problems, I’m not sure if the Quantum devs have written about it anywhere). Gaffer on Games has an excellent article citing a ton of devs on whether that is achievable.

Worth stating that even a single bit difference in the position of one object in your game will make it heavily desync very, very, very shortly.
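
A tiny illustration of how such a difference can appear, and why fixed point sidesteps it. This does not reproduce the cross-CPU issue itself; it only shows that floating-point results depend on the order of operations, whereas integer-based fixed point does not. The Q16.16 helpers are a hand-rolled sketch for this example and are not the API of TrueSync, Quantum, or any other library.

```python
FRACTION_BITS = 16
ONE = 1 << FRACTION_BITS  # the value 1.0 in Q16.16


def to_fixed(value):
    """Convert a constant to Q16.16 (do this at authoring time, not in the sim)."""
    return round(value * ONE)


def fixed_mul(a, b):
    # Python ints never overflow; in C# the product needs a 64-bit intermediate.
    # Note that >> rounds toward negative infinity for negative products.
    return (a * b) >> FRACTION_BITS


# Same three numbers, added in a different order.
float_left = (0.1 + 0.2) + 0.3
float_right = 0.1 + (0.2 + 0.3)

a, b, c = to_fixed(0.1), to_fixed(0.2), to_fixed(0.3)
fixed_left = (a + b) + c
fixed_right = a + (b + c)

print(float_left == float_right)  # False: the two sums differ in the last bit
print(fixed_left == fixed_right)  # True: integer addition is associative
```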

This dev actually implemented state serialization ahead of time to help track down desyncs, which has the nice side effect of allowing rejoin via snapshot.

Has there been any update on this? I have assumed they’re going to start with compiling FP in a strict mode, but I don’t know whether that is cross platform. Would be interested to see their progress.

I found FP in strict mode pretty reliable on x86/x64, but I’m unfamiliar with how well it goes on SIMD. On a lockstep-based Unity game I worked on in the past, we hit an AMD-versus-Intel desync involving Unity’s Mecanim. Curiously, only one character was affected and the difference was always one single bit, but it forced us to refactor things to remove the influence of Unity’s bone animations on the simulation.

Quantum sounds great, but the price is very high for indie devs ($1000 per month?!) and I haven’t had any response to my email about the Quantum trial version. I hope I can test whether it’s a good fit for my project or not… :-/

… there is currently no simple other solution for me (really sad). I hope DOTS is deterministic soon…

I think rollback is pretty terrible in lockstep scenarios, that makes it super expensive. It’s pretty terrible in fps gameplay, too.

In complex simulations, moving a deterministic simulation state across the net isn’t feasible; which is why large scale RTS work in lockstep and run the sim on all clients, locally, and just exchange sim command streams.

Advantage: Cheating (by breaking game rules) is impossible, although of course fog hacks are a possibility.

Disadvantage: A minimum latency on all commands. But you can be sure, 100000% sure, at any given time that what you see is the actual state of affairs. That’s worth 200-500 ms for me (i.e. in games like Supreme Commander: Forged Alliance), but in reality a server could dynamically detect player latencies and reduce it to sub-100 ms times, which suddenly are in the same league as input lag. Not great for an FPS, but fine for an RTS.
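
For comparison with rollback, plain lockstep with an input delay can be sketched as below: local commands are scheduled a few frames into the future, and a frame is only simulated once every player's command for it has arrived. The class name, the `input_delay` parameter, and the use of `None` as a no-op command are assumptions for this sketch; the "simulation" is reduced to appending to a log.

```python
class LockstepClient:
    def __init__(self, player_ids, input_delay=3):
        self.player_ids = list(player_ids)
        self.input_delay = input_delay
        self.frame = 0
        self.commands = {}   # frame -> {player_id: command}
        self.log = []        # commands applied so far, one list per frame
        # Nobody can have scheduled anything for the first few frames,
        # so fill them with no-op commands for every player.
        for f in range(input_delay):
            self.commands[f] = {p: None for p in self.player_ids}

    def schedule_local(self, player_id, command):
        """Queue a local command; returns the frame it will execute on."""
        target = self.frame + self.input_delay
        self.commands.setdefault(target, {})[player_id] = command
        return target  # the caller sends (target, player_id, command) to peers

    def receive(self, frame, player_id, command):
        self.commands.setdefault(frame, {})[player_id] = command

    def try_advance(self):
        """Simulate the current frame if all inputs are in; otherwise stall."""
        ready = self.commands.get(self.frame, {})
        if any(p not in ready for p in self.player_ids):
            return False
        self.log.append([ready[p] for p in self.player_ids])
        del self.commands[self.frame]
        self.frame += 1
        return True
```

The input delay is the "minimum latency" mentioned above: the larger it is, the less often `try_advance()` stalls, at the cost of every command feeling that much later.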

In my implementation of a lockstep protocol, I do a “grace period” lookahead. Basically the only place where it would really be felt is shooting, so what I do is spawn a proxy entity up front, which then gets “approved” by the server and spawned “in the future”, replacing the proxy from the present. Until then it simply doesn’t interact, but within ~100 ms it would not hit anything anyhow.

So what you mean is that you do client-side prediction, but without allowing the client proxy to affect the simulation? Not sure that would hold up too well, especially if it’s a game with AI entities the players are expected to interact with.

Rollback is basically a generalized way (aka: brute force) to perform client-side prediction on lockstep-based multiplayer games, done by guessing the other players’ inputs instead of waiting for them and correcting the simulation when they arrive. Sure, there are other ways to do prediction, but in a deterministic game you have fewer options since you need to keep a synced version of the simulation somehow.

I’d be interested in seeing examples of how people implement rollback in practice: how they structure non-trivial game code to support it, especially when rolling back spawned/despawned objects. Guessing that VFX/audio gets quite tricky, too. Sounds like a bit of a nightmare, really.

(Actually, it’d also be interesting to know which other games have successfully used a rollback system that aren’t 1-on-1 fighting games?)

While the concept of lockstep-without-rollback is simple, doing it in Unity can already negate many of the advantages of using Unity, depending on your game’s requirements. You’ve got to deal with a fixed-timestep simulation and doing your own interpolation for variable-timestep rendering, and can end up reinventing the wheel repeatedly (if using fixed point, there are many Unity systems that become unusable, including things you may take for granted like physics/raycasting/pathfinding).
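
The "own interpolation" part is small in itself. A common approach, sketched here with made-up names, is to keep the previous and current simulation positions and blend between them for rendering, with `alpha` being the fraction of a tick that has elapsed since the last simulation step. Floats are fine here because the result is only drawn and never fed back into the simulation.

```python
def interpolate(previous, current, alpha):
    """Blend two simulation positions for display; alpha is in [0, 1]."""
    return tuple(p + (c - p) * alpha for p, c in zip(previous, current))


# Tick N put the unit at (0, 0), tick N+1 at (10, 20); we are a quarter
# of the way through the tick when this frame is rendered.
render_position = interpolate((0, 0), (10, 20), 0.25)
print(render_position)  # (2.5, 5.0)
```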

Photon Quantum sounds impressive, but £1k/month just to try it puts it out of reach of hobbyists and most indies.

Even though I am not making an FPS, I always preferred UT99 and Q3A’s netcode to what modern games like Battlefield 3 and later do. Just so many, too many deaths after having been behind cover for half a second or more, just because someone with a worse connection gets to decide the truth for you. (The problem with rollback / backward reconciliation is that not only can it be gamed with cheats, but by its very design the player with the worst system has the greatest sway over the reality of the game, basically literally “the last word” about what is true.)

You press the button and, two network latencies and a partial frame latency later, you see the ground truth. Strangely, I did very much like Counter-Strike 1.6’s prediction, interpolation, and backwards reconciliation. No other game has come close yet. (Players back then would also tweak their cl_interp settings a lot because some really preferred the truth over the ideal story.)

Yeah. Well. “Client Side Pretension”, and the server makes it true, more like. :)

The disadvantage is that other clients will seem to spawn projectiles a little ahead of where they should. I mask this by having most weapons have windup times and large muzzle flashes.

I think a hard rollback of 2 or more simulation steps, or even as few as one, even on the client, would only work in sims that have very little state to begin with.

i.e., not an RTS or open world with thousands of units and rapidly switching loci of player interaction, unless you have a way to partition your sim state into some kind of spatial causality

… which could be awesome.
… which… could actually work in my system.

Best of both worlds perhaps. Thanks for giving me more insane ideas!

Ubisoft’s For Honor, as mentioned before. There are GDC talks about it.

Rollback lockstep is mostly used in fighting games, yes (it’s pretty much the gold standard for fighting game netcode), but its generalized nature makes it very good for games that have twitchy controls and are also hard on replication (large amounts of non-player entities).

Check the NetherRealm GDC talks; they cover spawning, sound and VFX rollback. The idea is to keep spawned elements around for the duration of the rollback window, but deactivated during rollback frames where they aren’t supposed to exist.
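
One way to read that idea, as a hedged sketch and not a description of how NetherRealm actually implemented it: spawned effects are never destroyed right away. Each one remembers the frame it was spawned and despawned on, is switched on or off depending on the frame currently being shown or re-simulated, and is only really freed once its despawn frame has fallen out of the rollback window. All names below are invented for the example.

```python
class Effect:
    def __init__(self, name, spawn_frame):
        self.name = name
        self.spawn_frame = spawn_frame
        self.despawn_frame = None  # None means "not despawned yet"
        self.active = False


class EffectPool:
    def __init__(self, rollback_window):
        self.rollback_window = rollback_window
        self.effects = []

    def spawn(self, name, frame):
        effect = Effect(name, frame)
        self.effects.append(effect)
        return effect

    def despawn(self, effect, frame):
        # Only mark it; a rollback may still need the object to exist.
        effect.despawn_frame = frame

    def show_frame(self, frame):
        """Activate exactly the effects that exist on the given frame."""
        for e in self.effects:
            e.active = (e.spawn_frame <= frame
                        and (e.despawn_frame is None or frame < e.despawn_frame))

    def prune(self, current_frame):
        """Free effects whose despawn can no longer be rolled back."""
        oldest = current_frame - self.rollback_window
        self.effects = [e for e in self.effects
                        if e.despawn_frame is None or e.despawn_frame >= oldest]
```

This sketch leaves out the harder case of an effect spawned by a mispredicted frame that never gets re-spawned after the rollback; that needs its own bookkeeping.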

About interpolation, you should be doing that anyway in some fashion with any decent net code, since you need to interpolate/extrapolate to match newly received server state.