Hi! I’m working on a P2P networking stack that works on the following principles:
It uses an existing P2P network (Kademlia / Mainline DHT) for initial player connection and matchmaking.
It performs NAT punching and establishes a direct connection between every pair of players in a room. Each player sends coordinates and other data (RPC messages, controls, etc.) directly to the other players, without any server or traffic proxying at all. That gives the lowest possible latency, which most other solutions can’t reach (and free traffic at any scale).
Downsides of the solution:
Cheaters, and the need to build dedicated validation servers to deal with them when the project grows.
Connection issues for ~5% of players behind some corporate VPN servers, complex types of NAT, etc. (maybe I’ll add a dedicated server to proxy traffic for these players in later versions).
What do you think of the idea? Is anyone interested in trying a beta?
Unreliable: used for coordinates, transforms, etc., everything that can be lost. Lowest latency and no confirmation.
Reliable: a distributed event log based on the Raft protocol. Events are committed to the log only after confirmation is received from the majority of players in the room. This guarantees delivery and order (at the cost of increased latency).
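To make the "majority of players" rule concrete, here is a minimal sketch of just the quorum check. It is not a Raft implementation (no terms, leader election or log matching), and `PendingEvent`, `majority` and the player IDs are hypothetical names, not part of the stack described above.

```python
from dataclasses import dataclass, field


def majority(room_size: int) -> int:
    """Smallest number of players that is more than half of the room."""
    return room_size // 2 + 1


@dataclass
class PendingEvent:
    """Hypothetical log entry waiting for confirmations."""
    payload: str
    room_size: int
    acks: set = field(default_factory=set)

    def ack(self, player_id: str) -> None:
        # A set ignores duplicate confirmations from the same player.
        self.acks.add(player_id)

    @property
    def committed(self) -> bool:
        return len(self.acks) >= majority(self.room_size)


event = PendingEvent("player3 died", room_size=5)
event.ack("p1")  # assumed here: the proposer counts as one confirmation
event.ack("p2")
print(event.committed)  # False: 2 of 5 is not a majority
event.ack("p4")
print(event.committed)  # True: 3 of 5 is a majority
```

The extra latency comes from that waiting step: the event is only usable once enough round trips have completed.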
FPS games can use both mechanisms simultaneously (unreliable for coordinates, reliable for deaths / score / etc.). A strategy game can use the reliable mechanism for everything (in this case it’s impossible to use the built-in physics / AI; you have to write your own unit movement). There are FixFloat, FixTransform and a bunch of other fixed-point classes to help with this.
The main difference is that it is serverless. Clients connect directly to each other; you don’t need a server at all. Most other solutions proxy traffic through a central server, so you have to rent it and pay for traffic, and you get increased latency (because a packet first goes to the central server, and only then goes to the other client).
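For anyone wondering why fixed-point classes matter here: if every client replays the same event log, the simulation has to produce identical results on every machine, and integer math does that where floats may not. Below is a rough sketch of the idea. The `Fixed` class and the 16.16 layout are assumptions for illustration only; the actual FixFloat API is not shown in this thread.

```python
SCALE_BITS = 16          # assumed 16.16 format, not the library's documented layout
ONE = 1 << SCALE_BITS    # the integer that represents 1.0


class Fixed:
    """Hypothetical fixed-point number stored as a scaled integer."""

    def __init__(self, raw: int):
        self.raw = raw

    @classmethod
    def from_float(cls, value: float) -> "Fixed":
        # Only for setting up constants; the simulation itself stays in integers.
        return cls(int(round(value * ONE)))

    def __add__(self, other: "Fixed") -> "Fixed":
        return Fixed(self.raw + other.raw)

    def __sub__(self, other: "Fixed") -> "Fixed":
        return Fixed(self.raw - other.raw)

    def __mul__(self, other: "Fixed") -> "Fixed":
        # The product carries the scale twice, so shift once to remove it.
        return Fixed((self.raw * other.raw) >> SCALE_BITS)

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Fixed) and self.raw == other.raw

    def to_float(self) -> float:
        return self.raw / ONE


# position + speed * dt, computed entirely with integers
position = Fixed.from_float(1.5) + Fixed.from_float(0.25) * Fixed.from_float(2.0)
print(position.to_float())  # 2.0
```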
I’m curious why you’d choose systems typically associated with filesharing networks for use as a matchmaker system.
Directly connecting every client with every client drastically increases the potential points of failure, and leaves no single host/client as an authority over who is connected and who is not.
My wild guess is that you’d probably end up with a higher than 5% failure rate. Just a basic example here. Let’s say you have a game with 10 players. In a typical host/client scenario you have 9 network connections, 1 connection from each client to the host. If any network connection fails, the host notices and drops the lost client. The client notices and tells the player, giving them the option to reconnect. Pretty clean.
Now take the same game but use your idea. Now each of the 10 clients has a network connection to each other client. I believe that results in 45 network connections. What happens when just 1 of those connections fails? Most of the clients don’t even know there is a problem, you just have 2 clients that aren’t syncing to each other, but they are to everyone else. What if they can’t automatically reconnect? Do you drop 1 of the clients? Both? Who decides? Additionally, what are the chances that all 45 connections are even initially set up correctly in comparison to the 9 network connections to the same host in the first example? My expectation is any game using this setup will quickly generate a reputation for poor network reliability, with a high failure rate for even getting matches started successfully.
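The 9 versus 45 figures, and the "chances that all connections are set up" question, can be put into a few lines. This is only back-of-the-envelope arithmetic: the per-link success rate `p` is a made-up number for illustration, and treating links as independent is a simplifying assumption.

```python
def star_links(players: int) -> int:
    """Host/client: every client holds one link to the host."""
    return players - 1


def mesh_links(players: int) -> int:
    """Full mesh: one link per unordered pair of players."""
    return players * (players - 1) // 2


def all_links_ok(links: int, per_link_success: float) -> float:
    """Chance that every link works, assuming links succeed independently."""
    return per_link_success ** links


players = 10
p = 0.99  # assumed for illustration only, not a measured figure

print(star_links(players), mesh_links(players))  # 9 45
print(all_links_ok(star_links(players), p))      # chance all 9 links work
print(all_links_ok(mesh_links(players), p))      # chance all 45 links work
```

Whatever value is plugged in for `p`, the mesh figure is lower than the star figure, because the same rate is compounded over five times as many links.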
There’s also a potential bandwidth issue. In the 10 client example with a traditional host, each client sends its updates to just the host. In your solution, each client sends the same updates to all 9 other clients. That’s a 9x increase in upload bandwidth usage for things like frequent position updates. It is typical of Internet connections to be more limited on upload bandwidth than download bandwidth. That might be enough to negate the inherent performance advantage of no longer needing to relay updates through the host to the other clients.
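The 9x upload figure can be checked with a quick calculation. The update rate and packet size below are hypothetical numbers picked only to make the arithmetic visible, and the sketch ignores packet headers and the extra relay traffic the host itself carries in the host/client model.

```python
def client_upload_star(updates_per_sec: int, bytes_per_update: int) -> int:
    """Bytes per second a client uploads when it sends only to the host."""
    return updates_per_sec * bytes_per_update


def client_upload_mesh(players: int, updates_per_sec: int, bytes_per_update: int) -> int:
    """Bytes per second a client uploads when it sends to every other player."""
    return (players - 1) * updates_per_sec * bytes_per_update


# Hypothetical workload: 20 position updates per second, 50 bytes each.
star = client_upload_star(20, 50)
mesh = client_upload_mesh(10, 20, 50)
print(star, mesh, mesh // star)  # 1000 9000 9
```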
I think connecting every client to every client has a better chance of success as a LAN focused networking system. Your local network will be more reliable, and won’t actually need to deal with router shenanigans, and the higher upload bandwidth requirements from your system won’t be an issue either.
Well, NAT punch through requires a server for the initial setup, so not exactly serverless. I’m curious if you plan on setting up your own DHT servers using an existing protocol for your games, or if you intend on misusing nodes in a file sharing network owned by someone else (would have the risk of getting your game shut down if they figured out what you’re doing).
It allows storing small amounts of data for a short period of time (the BEP-44 protocol). That’s enough to store room settings and the peer info needed for NAT punching. In my tests it works pretty well.
What do you mean by “connection”? At the low / middle level you have IP datagram packets (UDP packets: a single UDP socket that listens on some port and can send / receive packets to / from any client). If a client has internet connection issues, it will be unable to send / receive packets to any other client, so how is it possible that only one “connection” drops?
All the other peers just stop receiving any coordinates from it, and after the timeout expires the leader (elected by the Raft protocol) drops the client.
I was talking about optimizing latency, not traffic. It’s true, outgoing bandwidth will be higher, no magic.
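A rough sketch of what that timeout check could look like on the leader's side. `LivenessTracker`, the 3 second timeout and the peer names are all hypothetical; the real stack's API and timeout values are not given in this thread.

```python
class LivenessTracker:
    """Hypothetical helper: remembers when each peer was last heard from."""

    def __init__(self, timeout_sec: float):
        self.timeout_sec = timeout_sec
        self.last_seen = {}

    def on_packet(self, peer_id: str, now: float) -> None:
        # Called for every packet received from a peer.
        self.last_seen[peer_id] = now

    def timed_out(self, now: float) -> list:
        # Peers that have been silent for longer than the timeout.
        return sorted(
            peer for peer, seen in self.last_seen.items()
            if now - seen > self.timeout_sec
        )

    def drop(self, peer_id: str) -> None:
        self.last_seen.pop(peer_id, None)


tracker = LivenessTracker(timeout_sec=3.0)  # assumed timeout value
tracker.on_packet("alice", now=0.0)
tracker.on_packet("bob", now=0.0)
tracker.on_packet("alice", now=2.5)   # alice keeps sending, bob goes silent
print(tracker.timed_out(now=4.0))     # ['bob']
```

Timestamps are passed in explicitly so the check is easy to test; in a running game they would come from a monotonic clock.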
I ran several tests: even if you are in the same city as the dedicated server, direct connections give at least 2x lower latency. If the server is in another city or country, the latency improvement is much larger. Also, the matchmaker has a “geo-distance” parameter based on IP; if enabled, it tries to match players from the same location together.
I’m using an existing public DHT network instead of a dedicated server (I’m experimenting with two networks: KAD / Mainline DHT, which allows storing up to 2K for 2 hours, and OpenDHT, which allows storing up to 64K). Clients join these networks and help other peers perform matchmaking and NAT punching. I’m not “hacking” or abusing the networks; these are official network standards, and these networks were created for developing P2P applications (file sharing is only one possible application, there are many others; I’m trying to apply it to gaming).
Typically in UDP networking systems for games you build a connection oriented system on top of UDP. That’s all I’m referring to as a “connection”.
As for how it is possible for just 1 connection to drop: the Internet is a spider web of networking equipment with common failures. My ability to send a message directly to you can temporarily fail, even though both of us can get to a 3rd party, because the link my provider internally uses to send packets your way has failed, but the link my provider uses to get to the 3rd party we can both still talk to is good. These kinds of temporary disruptions on the Internet are extremely common.
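To illustrate the situation, here is a small sketch that finds pairs of peers that cannot reach each other directly while both can still reach some third peer. The peer names and the `split_pairs` helper are made up for the example; how a real mesh would learn which links are broken is a separate problem.

```python
from itertools import combinations


def split_pairs(peers, broken_links):
    """Pairs with no direct link that can both still reach a common third peer."""
    broken = {frozenset(link) for link in broken_links}

    def linked(a, b):
        return frozenset((a, b)) not in broken

    result = []
    for a, b in combinations(peers, 2):
        if linked(a, b):
            continue
        others = (c for c in peers if c not in (a, b))
        if any(linked(a, c) and linked(b, c) for c in others):
            result.append((a, b))
    return result


peers = ["alice", "bob", "carol"]
print(split_pairs(peers, [("alice", "bob")]))  # [('alice', 'bob')]
```

In a host/client setup a pair like this never matters, because only the client-to-host links exist. In a full mesh every such pair needs a decision about who gets dropped or relayed.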
You are right, that’s why I’m looking for some beta-testers to check how it will work on real projects.
In small tests I haven’t run into the issues you are talking about; maybe they will be noticeable at large scale.