I’d like to start a discussion on the DOTS networking framework currently being developed (not just the Transport layer, but also the “NetCode” framework), and hopefully get some devs to share thoughts on where they want to go with this.
Here are a few questions:
1- I’d be curious to hear what are the fundamental principles behind the network architecture planned in the “NetCode” project:
Is it going towards a Quake3/Overwatch-style architecture with automatic world rollback, prediction & re-simulation for lag-compensation?
Or is it more of a Tribes2/HaloReach-style architecture where lag compensation is left to be handled manually, with the benefit of not having the CPU overhead of auto-resimulation?
Are there plans for congestion control, priority systems and network culling solutions?
Will the codegen remain explicit, or will it be converted to a Cecil approach? (Honestly that’s not a big concern to me, but I’m still curious.)
etc…
2- How heavily in development is the NetCode project? I think people are concerned about this due to the Unity + networking situation of the last few years, and it would be great to be reassured. Is there a “frontman/frontwoman” for this NetCode project that we can follow on the forums or on Twitter? Is the public “multiplayer” repo the same one the devs actually work in, or is it just used for curated releases?
3- The other major concern people have, because Unity’s networking announcements suspiciously tend to focus on dedicated servers (something nobody is worried about) instead of actual netcode (something everybody is worried about), is that there will be artificial restrictions that force us to use Unity’s server hosting. Can you confirm that both the Transport and the NetCode framework will remain usable with any server hosting solution, as well as self-hosted/player-hosted setups if we want? I totally get it if things like matchmaking are Unity-exclusive, but our hosting options have to stay open. I’ve worked on a bunch of connected “interactive installation”-like projects that require a LAN setup, as well as confidential game projects that we want to be able to test over LAN before moving to a DGS, and I’d like to know if NetCode will support those use cases.
I’d also love to know, since I’ve been trying to see where NetCode is going (and also learn how it works, especially the RPC layer), so question #2 is something I’m heavily interested/invested in.
For my purposes, the network architecture I’m following on their GitHub works quite well. I’m mostly doing LAN with a client-server architecture (demo vid below):
This is a broad question and I can’t really cover all the details in a forum post, but I can try to outline the basics.
The basic architecture we are implementing is server-authoritative with client prediction and lag compensation on the server - like most modern FPS games.
For interpolated ghosts (server-controlled networked objects), which are simply displayed at the last position received from the server (minus interpolation time), we want to make it as automatic and easy to work with as possible, since there are usually a lot of them and they all tend to have exactly the same logic.
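As an illustration of what that interpolation does conceptually - a minimal sketch with hypothetical types, not our actual API:

```csharp
using System.Collections.Generic;
using Unity.Mathematics;

// Minimal interpolation sketch with a hypothetical Snapshot type. A ghost is
// rendered at (latest server time - interpolation delay) by blending the two
// buffered snapshots that straddle that render time.
public struct Snapshot
{
    public uint Tick;
    public float3 Position;
}

public static class GhostInterpolation
{
    public static float3 Sample(List<Snapshot> buffer, float renderTick)
    {
        // Find the pair of snapshots surrounding the render tick and lerp.
        for (int i = 0; i < buffer.Count - 1; ++i)
        {
            var a = buffer[i];
            var b = buffer[i + 1];
            if (a.Tick <= renderTick && renderTick <= b.Tick)
            {
                float t = (renderTick - a.Tick) / (b.Tick - a.Tick);
                return math.lerp(a.Position, b.Position, t);
            }
        }
        // Not enough data around renderTick yet; show the newest we have.
        return buffer[buffer.Count - 1].Position;
    }
}
```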
For predicted ghosts which the player have control over - that needs to be displayed at the tick the server will be at when it receives the input from the client - we are heading in the direction of automatic prediction/re-simulation, but it is still TBD exactly how automatic it will be. Right now it is only semi automatic, it rolls back for you and there are some utilities for re-simulating but you need to handle much of it and make sure it matches the server simulation yourself. This is something we are planning to improve, at the very least we will make the utilities easier to use and make it easier to share code between server simulation and prediction.
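The code sharing boils down to routing both the server tick and the client’s prediction/re-simulation through one pure step function. A hypothetical sketch (none of these names are NetCode API):

```csharp
using Unity.Mathematics;

// Hypothetical shared step function: the server calls Tick() once per
// simulation tick, and the client calls the exact same Tick() both for
// prediction and for every re-simulated tick after a rollback. Keeping it
// pure (state + input -> state) is what makes prediction match the server.
public struct PlayerInput
{
    public float2 Move;
}

public static class CharacterStep
{
    public static void Tick(ref float3 position, ref float3 velocity,
                            in PlayerInput input, float deltaTime)
    {
        velocity += new float3(input.Move.x, 0f, input.Move.y) * 10f * deltaTime;
        position += velocity * deltaTime;
    }
}
```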
Lag compensation on the server is mostly about knowing which tick the client saw for interpolated ghosts and raycasting against that when firing hitscan weapons, so you don’t have to lead your targets. It is not implemented yet.
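The usual implementation, sketched here with hypothetical types (again, ours is not implemented yet), keeps a short position history per ghost and rewinds it before the hit query:

```csharp
using Unity.Collections;
using Unity.Mathematics;

// Sketch of the usual server-side approach: keep a short position history per
// interpolated ghost, and when a hitscan shot arrives, query positions at the
// tick the shooter actually saw instead of the present ones.
public struct HistoryEntry
{
    public uint Tick;
    public float3 Position;
}

public static class LagCompensation
{
    public static float3 PositionAtTick(NativeList<HistoryEntry> history, uint tick)
    {
        // Walk backwards to the newest entry at or before the requested tick.
        for (int i = history.Length - 1; i >= 0; --i)
            if (history[i].Tick <= tick)
                return history[i].Position;
        return history[0].Position; // older than the buffer; best effort
    }
}
```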
We also have support for predictive spawning, where you can spawn entities on the client that you think the server is going to spawn. The server will take ownership of the entity when the snapshot arrives, and the entity will be automatically destroyed if the snapshot does not arrive. The function that matches incoming server snapshots to predictively spawned entities is something you need to implement yourself.
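That matching function typically ends up being something like the following; all names here are illustrative, not NetCode API:

```csharp
using Unity.Mathematics;

// Hypothetical matching function for predictive spawning: an incoming server
// ghost claims a predictively spawned entity if the ghost type matches and it
// appeared close in time and space to our guess.
public struct SpawnInfo
{
    public int GhostType;
    public uint SpawnTick;
    public float3 Position;
}

public static class PredictiveSpawning
{
    public static bool Matches(in SpawnInfo guess, in SpawnInfo incoming)
    {
        return guess.GhostType == incoming.GhostType
            && math.abs((int)guess.SpawnTick - (int)incoming.SpawnTick) <= 4
            && math.distance(guess.Position, incoming.Position) < 1f;
    }
}
```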
The core of the NetCode, where we have spent the most time, is an efficient way of processing large numbers of entities: prioritizing them per ECS chunk and ghosting them to the clients with delta compression.
So the base of the priority system is already there. Each ghost type calculates an importance factor for a chunk of entities. (If you want to prioritize on distance, you currently need to partition your entities spatially into chunks with shared components and calculate bounds per chunk; per-entity prioritization is something we will investigate if we find a use case that requires it.)
We send snapshots with a fixed bandwidth: once we know all the chunks, we sort them by importance, fill up a packet with the most important entities, and send it. Importance values are always scaled by the time since the last send, so anything that did not fit will have higher importance next frame. We will make this more configurable going forward, so you can reduce packet size or send rate in response to congestion, and we will also provide the data to make those decisions.
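In simplified form, the send loop works like this - a sketch, not our actual implementation:

```csharp
using System.Collections.Generic;

// Simplified sketch of importance-based snapshot filling. Each chunk's base
// importance is scaled by the number of ticks since it was last sent, so
// anything that keeps missing the packet bubbles up; the fixed-size packet is
// then filled greedily from the top of the sorted list.
public class ChunkEntry
{
    public int BaseImportance;  // per ghost type / distance heuristic
    public uint LastSentTick;
    public int SerializedSize;  // stand-in for the real delta-compressed size
}

public static class SnapshotSend
{
    static int Scaled(ChunkEntry c, uint tick) =>
        c.BaseImportance * (int)(tick - c.LastSentTick);

    public static void FillPacket(List<ChunkEntry> chunks, uint tick, int budget)
    {
        chunks.Sort((a, b) => Scaled(b, tick).CompareTo(Scaled(a, tick)));
        foreach (var chunk in chunks)
        {
            if (chunk.SerializedSize > budget)
                continue; // did not fit; its importance keeps growing
            budget -= chunk.SerializedSize;
            chunk.LastSentTick = tick;
        }
    }
}
```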
We are not planning to move the codegen to Cecil; we plan to keep generating .cs files. The main reason is debuggability: once you generate assemblies rather than code, it is much harder to debug since you can’t just set a breakpoint or add a Debug.Log somewhere. Another reason is that we want to make sure it is possible to override the generated code if you want to do something custom for a specific type, so you can generate the code as a starting point and then hand-edit it.
Another important pillar of the NetCode, which is often forgotten, is tooling and workflows. We have experimental workflows for playing multiplayer games with multiple clients and an in-proc server in the editor, and tools for visualizing the data flow and composition of snapshots. We think these kinds of things are critical for the new NetCode to succeed, and we will continue to improve them.
I can quantify the development investment in a meaningful way: we have a dedicated DOTS Multiplayer team working on it, which I am leading. It is staffed with developers who have been working on DOTS since long before it was called DOTS, and who have substantial experience from big AAA multiplayer games.
The public GitHub is not the main repo yet. Since we pushed the NetCode there, we try to push to it as soon as we integrate significant feature branches into master on our internal repo. We are also going to do some development in real games, since we want to drive the development with real use cases, but we will try to keep pushing to the public repo frequently regardless of where the active development is happening.
Server hosting, matchmaking, and other services are developed by a different team, which tends to make more announcements since it has a more mature product. We have no plans to require any of those services for the Transport or the NetCode.
Being able to run the game with a server and client in the same process is IMO required for good development workflows, which - as mentioned earlier - is something we consider important. And as you point out, for many projects it is a strict requirement to run with a LAN server during development. (I would say it is required for practically all projects, since you need to be able to debug and profile your server locally.)
I think productive flows are really the hard problem here. The rest is fairly well known, even if it’s time consuming to get the details right.
One key thing almost everyone new to this gets wrong is that server flows are necessarily, and correctly, different from the client’s. You can still get code reuse, but it needs to happen at the right abstraction layer. So while you might very well share the same data structures on client and server, the ECS systems would correctly look quite different on a per-feature basis.
Currently my favored approach is to abstract core logic in a way that makes it easy to plug into the client as a form of mocking server functionality - although in this case the mocks are actually the real logic shared by client and server. This is primarily for the 90% of features that aren’t on the hot path per se, but where most of your dev time ends up going in a larger, complex multiplayer game.
With core logic abstracted towards this goal, mocking server functionality is very simple and can be done with very few lines of code per feature. In our game, which is a rather large, complex MMO, outgoing messages are redirected to a mock server class. It’s a single file. 90% of features are mocked with a single call into shared logic plus the creation of the response message, so 2-4 LOC on average to mock out an entire feature. The response is then injected back into the incoming message routing.
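Stripped down, the pattern looks like this; all the names are made up, but the shape matches what I described:

```csharp
using System;

// Sketch of the mock-server pattern. Outgoing client messages are redirected
// to MockServer instead of the socket; it calls the same shared logic the
// real server uses, then feeds the response straight back into the client's
// normal incoming-message routing.
public class BuyItemRequest  { public int ItemId; }
public class BuyItemResponse { public bool Success; }

public static class InventoryLogic // the "real" logic, shared with the server
{
    public static bool BuyItem(int itemId) => itemId > 0;
}

public class MockServer
{
    readonly Action<object> _deliverToClient; // the client's incoming routing

    public MockServer(Action<object> deliverToClient)
        => _deliverToClient = deliverToClient;

    public void Handle(object outgoingMessage)
    {
        switch (outgoingMessage)
        {
            // One call into shared logic + one response message:
            // 2-4 LOC to mock an entire feature.
            case BuyItemRequest buy:
                _deliverToClient(new BuyItemResponse
                    { Success = InventoryLogic.BuyItem(buy.ItemId) });
                break;
        }
    }
}
```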
Having to run a client and a server for most of your iteration is a less than productive flow. With the above approach I can work for days on a feature without touching the server, with a high degree of confidence that it will mostly just work when I do the client/server testing. Of course this assumes some degree of unit testing, but even that is faster when it’s outside Unity.
Client and server in the same process has a lot of the problems that mocking/stubbing were designed to fix. It’s better than having two processes, but I don’t think it’s the best solution we have available.
One use case that I think is interesting to think about is the problem Glenn Fiedler describes here (but disregard the solution he opted for, because it would not be suitable for a game where cheat prevention is important). In short: how do you make a responsive online game where players all control dynamic rigidbodies that can interact with each other?
For a game like that, I’d imagine the “ideal” setup would look like this:
Start with the assumption that our physics simulation is “deterministic enough”
Whenever the server receives inputs from clients, revert the world to the tick of the earliest received input and re-simulate to the present, using all the inputs we know of and predicted inputs otherwise
Whenever a client receives a world snapshot from the server, revert the world to that snapshot’s tick, apply it, and re-simulate to the present using our own inputs as well as predicted inputs for other players
Basically: whenever either the client or the server receives any data from the other, revert the world to the tick of that data and re-simulate absolutely everything to the present. It would have a much heavier CPU cost than other network architectures for sure, but this is a case where I’m hoping DOTS performance might help make it a viable option for some games.
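In pseudocode, I imagine both endpoints running the same loop. A sketch under the “deterministic enough” assumption, with every type hypothetical:

```csharp
using System;
using System.Collections.Generic;

// Sketch of the revert-and-resimulate idea. Every type here is hypothetical;
// it only shows the shape of the loop both endpoints would run whenever
// authoritative data for a past tick arrives.
public struct PlayerInput { public float Horizontal, Vertical; }

public interface IRollbackWorld
{
    IReadOnlyList<int> Players { get; }
    void RestoreSnapshot(uint tick);                  // revert to a stored state
    void ApplyInput(int playerId, PlayerInput input);
    void Step(float deltaTime);                       // one deterministic tick
}

public static class RollbackLoop
{
    // Called on client *and* server when data for a past tick arrives.
    public static void OnAuthoritativeData(
        IRollbackWorld world, uint dataTick, uint presentTick,
        Func<int, uint, PlayerInput> getKnownOrPredictedInput, float deltaTime)
    {
        world.RestoreSnapshot(dataTick);
        for (uint tick = dataTick; tick < presentTick; ++tick)
        {
            // Use real inputs where we have them, predicted ones otherwise
            // (e.g. repeat the last input received from that player).
            foreach (var player in world.Players)
                world.ApplyInput(player, getKnownOrPredictedInput(player, tick));
            world.Step(deltaTime);
        }
    }
}
```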
I’d totally understand if NetCode decides not to support such use cases, but I still think it might be worth considering whether NetCode could be structured in a way that makes this strategy easily implementable by users who need it. I think it would also be very helpful for lag compensation of non-hitscan projectiles.
And by the way, don’t hesitate to tell me if I’m wrong about this stuff. My online game programming experience is limited
Our focus is on the FPS netcode architecture Tim describes, since it is the most commonly used in games and the most broadly applicable: it is fair in competitive play, safe against cheating, etc.
We do believe that flexibility with netcode architectures is important. Different games sometimes need different netcode architectures, for different tradeoffs. There is no one size fits all for netcode. So we are building all simulation systems to enable that flexibility.
A large amount of the work for good netcode is ensuring that all subsystems are built on the right principles. Performance by default, simulation determinism by default, and complete & performant access to all state/data are obviously the primary pillars that make it possible to build your own custom netcode architecture on top of ECS without fighting the system…
Over time we want to provide examples of different architectures, like deterministic lockstep for RTS or GGPO-style rollback for fighting games. For now the focus is on providing a great FPS netcode architecture & examples.
We use multiple worlds for server & client. This allows for easy debugging in the same process. The asteroids demo even lets you switch between multiple client visualizations quickly to ease debugging of multiple players all in the same process.
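In plain Unity.Entities terms this just means creating more than one World in the process. A rough sketch - the two systems are placeholders, not our actual system groups, and the worlds still need to be attached to the player loop to update:

```csharp
using Unity.Entities;

// Rough sketch of client and server as separate ECS worlds in one process,
// using the plain Unity.Entities World API.
public class ServerSimulationSystem : ComponentSystem
{
    protected override void OnUpdate() { /* authoritative simulation */ }
}

public class ClientSimulationSystem : ComponentSystem
{
    protected override void OnUpdate() { /* prediction + presentation */ }
}

public static class LocalClientServerBoot
{
    public static void CreateWorlds()
    {
        var server = new World("ServerWorld");
        server.GetOrCreateSystem<ServerSimulationSystem>();

        var client = new World("ClientWorld");
        client.GetOrCreateSystem<ClientSimulationSystem>();

        // The "network" between the two worlds can be an in-memory
        // queue / memcpy instead of a socket, exactly as described above.
    }
}
```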
This separation is a best practice used in all major AAA FPS games. Even the single-player missions in FPS shooters usually run through the normal client/server model, simply using memcpy instead of sockets to copy the data.
This is to ensure that all gameplay programmers always test the same systems, to avoid the situation where a single network dude on the team gets to fix everyone’s code because everyone else can just “happily ignore the annoying details of making game code work with the netcode”.
This is all what I expected; however, for a large number of Unity customers/users there will still be MonoBehaviour projects, so is there a plan for a bridge? I can’t really use the word “hybrid”, if I’m honest, as it doesn’t really describe the functional purpose.
My guess is that getting data from the DOTS world in order to represent it in the GameObjects world will be easily doable, but implementing simulation logic in MonoBehaviours is another story. You’d at least need a custom update that is called from the main DOTS simulation world and does the following (see the sketch after this list):
get the data from the right entities in the DOTS world
do some logic
set back the data to the right entities in the DOTS world
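Something along these lines - purely a hypothetical sketch of that custom update, not an official API:

```csharp
using Unity.Entities;
using Unity.Mathematics;
using Unity.Transforms;
using UnityEngine;

// Hypothetical bridge: a system running at a fixed point in the DOTS
// simulation loop that pulls data off the entity, lets MonoBehaviour-style
// logic mutate it, and writes it back before serialization systems run.
public class MyMonoLogic : MonoBehaviour
{
    public float3 Simulate(float3 position)
    {
        // arbitrary gameplay logic living in MonoBehaviour land
        return position + new float3(0f, Time.deltaTime, 0f);
    }
}

public class MonoBridgeSystem : ComponentSystem
{
    protected override void OnUpdate()
    {
        Entities.ForEach((MyMonoLogic mono, ref Translation translation) =>
        {
            // 1. get the data from the entity
            var position = translation.Value;
            // 2. run the MonoBehaviour logic
            position = mono.Simulate(position);
            // 3. set the data back so the snapshot/serialization systems see it
            translation.Value = position;
        });
    }
}
```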
I think the real requirement is this: there is a specific point in the DOTS simulation loop where all of your Entities have to be properly updated by your gameplay logic, so that the DOTS snapshot/serialization systems can then do their job
I gotta admit, though: part of me hopes that things like NetCode, VisualScripting, Subscene Streaming, etc. just remain 100% focused on DOTS in order to accelerate the community’s transition to DOTS as soon as possible
I think DOTS has a stigma - a fear attached to it - that will prevent any en-masse migration for quite some time. Firstly, it’s still in development, and secondly it’s a new way of thinking, so IMHO that kind of shift can’t happen until at least visual scripting is fully there too - people will generally adopt without too much friction at that point…
Sorry for slightly off topic.
I’m trying to combine DOTS stuff into one project - multiplayer and physics, mainly. Just adding the two caused physics not to run in any world. So I modified the Physics package code to add the SimulationGroup attribute, but quickly realized the worlds will need the Transform systems as well. How is this all planned to work in the end with packages that may or may not depend on the client/server simulation player loop structure? Will developers be required to manually create bootstraps to filter all systems from all packages, deciding whether they should be added to the simulation or not?
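For reference, my workaround looks roughly like this (the group type is from the multiplayer repo as far as I can tell, so the names may be off):

```csharp
using Unity.Entities;

// Roughly the workaround: explicitly placing a driver system in the
// simulation group shared by both client and server worlds, instead of
// modifying the package code itself.
[UpdateInGroup(typeof(ClientAndServerSimulationSystemGroup))]
public class RunPhysicsInBothWorldsSystem : ComponentSystem
{
    protected override void OnUpdate()
    {
        // tick the Unity.Physics / Transform systems from here
    }
}
```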
We noticed the same thing recently; it is not how we want it to work, and we are currently investigating how best to deal with it. We are not there yet - but in the end you should not have to write a custom bootstrap or modify packages in order to run the default or package systems in the client or server world.
I looked through the alpha code, and I see things aren’t really settled in terms of synchronizing general data. For example, diffing isn’t really a thing here. Am I mistaken?
Updated my project with the new things. Neat to see that systems are added to both server and client by default now. What I still don’t like is using an archetype to create ghost entities - a player usually has children etc., and from what I understand that can’t be represented in an archetype (maybe I’m wrong here). And we have to manually write the archetype for the server instead of something generated from/using the ghost prefab.
Edit: I guess creating the avatar of the ghost object with a delay could work as a solution as well.
We do want to support ghosts consisting of multiple entities, but we have not implemented that yet.
Which archetype usage is it you do not like?
The archetype usage for instantiating ghosts on the client is optional: if you have a reference to the prefab in the client world (which you can get by creating a GameObject with a GhostPrefabAuthoringComponent and ConvertToClientServerEntity set to client only), it will instantiate the ghost prefab instead of using the archetype.
You can also use the ghost prefab to instantiate entities on the server.
We do not yet support finding ghost types with multiple entities on the server, nor serializing/deserializing ghosts with multiple entities. We also have not tried creating ghost prefabs with multiple entities, so it might not solve your root issue today.
Sweet, didn’t see GhostPrefabAuthoringComponent before. Saw that you added a quick guide with some bits now as well. I don’t require any serialization of data on the child entities for now, so I think it will solve it. : )