Receive queue is full, some packets could be dropped, consider increase its size

Hello,

We are seeing the following error messages on our Linux Dedicated Server hosted in Unity Cloud, accompanied by server disconnects and crashes. The crashes happen a couple of times per day, and I haven’t been able to detect a pattern.

```
Failed to decrypt packet (error: 1048580). Likely internal TLS failure. Closing connection.
Failed to establish connection with the Relay server.
Transport failure! Relay allocation needs to be recreated, and NetworkManager restarted. Use NetworkManager.OnTransportFailure to be notified of such events programmatically.
```

In an attempt to resolve the issue, I've increased the "max packet queue size" on the Unity Transport component on the NetworkManager, but the crashes still occur. I'm not sure whether the change should actually be made on the created Relay allocation instead.

I've also tried the following: 

```csharp
var settings = new NetworkSettings();
settings.WithNetworkConfigParameters(
    sendQueueCapacity: 1024,
    receiveQueueCapacity: 1024);

var driver = NetworkDriver.Create(settings);
```

Notes

  • This can happen when only 1 player is online
  • Clients are joining from WebGL
  • We are using a Dedicated Server and Relay with Websockets to allow WebGL

Packages

  • Netcode for GameObjects 1.7.1
  • Multiplay 1.1.1
  • Relay 1.0.5
  • Lobby 1.1.2

More info on our setup that allows WebGL here:

Curious if anyone has any suggestions. I’m unable to find many resources on this issue.

Server Setup Code

```csharp
await UnityServices.InitializeAsync();

// restart server if transport crashes
NetworkManager.Singleton.OnTransportFailure += OnServerTransportRestarted;

if (!AuthenticationService.Instance.IsSignedIn)
    await AuthenticationService.Instance.SignInAnonymouslyAsync();

var serverConfig = MultiplayService.Instance.ServerConfig;
Debug.Log($"Server ID[{serverConfig.ServerId}]");
Debug.Log($"AllocationID[{serverConfig.AllocationId}]");
Debug.Log($"Port[{serverConfig.Port}]");
Debug.Log($"QueryPort[{serverConfig.QueryPort}]");
Debug.Log($"LogDirectory[{serverConfig.ServerLogDirectory}]");

string ipv4Address = "";
ushort port = serverConfig.Port;
NetworkManager.Singleton.GetComponent<UnityTransport>().SetConnectionData(ipv4Address, port, "0.0.0.0");

int maxConnections = 100;

Allocation allocation = await RelayService.Instance.CreateAllocationAsync(maxConnections);
NetworkManager.Singleton.GetComponent<UnityTransport>().SetRelayServerData(new RelayServerData(allocation, "wss"));
var joinCode = await RelayService.Instance.GetJoinCodeAsync(allocation.AllocationId);
Debug.Log("Relay Join Code: " + joinCode);

// var settings = new NetworkSettings();
// settings.WithSecureServerParameters(serverCert, serverKey);
// NetworkDriver.Create(settings);
// NetworkManager.Singleton.GetComponent<UnityTransport>().SetServerSecrets(serverCert, serverKey);

var settings = new NetworkSettings();
settings.WithNetworkConfigParameters(
    sendQueueCapacity: 1024,
    receiveQueueCapacity: 1024);

var driver = NetworkDriver.Create(settings);

try
{
    var createLobbyOptions = new CreateLobbyOptions();
    createLobbyOptions.IsPrivate = false;
    createLobbyOptions.Data = new Dictionary<string, DataObject>()
    {
        {
            "JoinCode", new DataObject(
                visibility: DataObject.VisibilityOptions.Public,
                value: joinCode)
        }
    };

    Lobby lobby = await Lobbies.Instance.CreateLobbyAsync("Explore", maxConnections, createLobbyOptions);
    string lobbyId = lobby.Id;

    StartCoroutine(HeartbeatLobbyCoroutine(15, lobbyId));
}
catch (LobbyServiceException e)
{
    Debug.Log(e);
}

NetworkManager.Singleton.StartServer();

if (serverQueryHandler == null)
    serverQueryHandler = await MultiplayService.Instance.StartServerQueryHandlerAsync(100, "xx.xxx.xxx.53", "Explore", "54827", "Forest");
```

I would implement a packet logging mechanism. Netcode's source is available to you, so it should be possible to find the place where it receives packets and log each packet with a timestamp and its size, plus periodically log additional info like the size of the receive queue.

You should also do this (selectively, and configurable) on the client side, because the error seems to indicate that the server is receiving too many packets at once. Consider cases where a client might send too many packets per frame, or many packets over consecutive frames, and also determine the size of those packets. A minimal logging helper is sketched below.
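For instance, something along these lines could be called from wherever you add a hook in the receive path (a minimal sketch; PacketLog and its members are illustrative names, not part of Netcode or Unity Transport):

```csharp
using System.Diagnostics;

// Sketch of a packet-logging helper; PacketLog and its members are
// illustrative names, not part of Netcode or Unity Transport. Call
// LogPacket() from wherever you add a hook in the receive path.
public static class PacketLog
{
    static int s_Count;
    static long s_Bytes;
    static readonly Stopwatch s_Window = Stopwatch.StartNew();

    public static void LogPacket(int sizeInBytes)
    {
        s_Count++;
        s_Bytes += sizeInBytes;
        UnityEngine.Debug.Log(
            $"[{UnityEngine.Time.realtimeSinceStartup:F3}] packet received, {sizeInBytes} B");

        // Summarize periodically to spot bursts without drowning in log lines.
        if (s_Window.ElapsedMilliseconds >= 5000)
        {
            UnityEngine.Debug.Log($"last 5 s: {s_Count} packets, {s_Bytes} B total");
            s_Count = 0;
            s_Bytes = 0;
            s_Window.Restart();
        }
    }
}
```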

Netcode also has some utilities that let you see network stats. These should be integrated as a debug view into your game if Netcode doesn’t already provide a means to overlay them.

I don’t know how experienced you are with networked games, so I just put this out there in case you haven’t given bandwidth consumption much thought. :wink:

There’s a number of script examples posted by users that indicate a certain lack of understanding of how easily one can bloat either the number of packets or the size of packets. Common examples (sketched in code below):

  • Sharing a collection as a NetworkVariable, which means that every time one item in the collection changes, the entire collection may be sent, depending on how it gets serialized.
  • Running the server at an unusually high tick rate. For most games the default tick rate of 50 Hz is already a bit overkill, but some users set this to 120 Hz and even more!
  • Synchronizing every bullet or particle effect, rather than one-shot starting those on every client and letting each client simulate them locally, because their behaviour is deterministic and/or only a visualization that doesn’t affect gameplay.
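To make the first two pitfalls concrete, here is a rough sketch (InventorySync and m_ItemIds are made-up names for illustration; NetworkList is Netcode's delta-synchronized alternative to putting a whole collection in a NetworkVariable):

```csharp
using Unity.Netcode;

// Illustrative sketch only: InventorySync and m_ItemIds are made-up names.
public class InventorySync : NetworkBehaviour
{
    // Pitfall: a NetworkVariable wrapping a whole collection may re-serialize
    // the entire collection whenever a single item changes. NetworkList
    // synchronizes per-element deltas instead:
    private NetworkList<int> m_ItemIds;

    private void Awake()
    {
        m_ItemIds = new NetworkList<int>();
    }
}

// Pitfall: an unusually high tick rate multiplies all synchronization
// traffic. The tick rate lives on the NetworkManager's NetworkConfig, e.g.:
//     NetworkManager.Singleton.NetworkConfig.TickRate = 30;
```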

Check your network code and reason about how often NetworkVariables change and get synchronized, and how many RPCs a client could possibly send in a single frame. Multiply the latter by the maximum number of connected clients. If a client could send out 50+ packets and you have at most 8 clients … you can do the math. :wink:

Thank you for the suggestions. A few more notes:

  • This is a very simple setup with mostly default values (outside of WebSockets enabled for WebGL)
  • Using the ClientNetworkTransform script (simple transform syncing)
  • No NetworkVariables used yet
  • [ClientRpc] and [ServerRpc] are used to send some basic state information

I experimented with a networked Unity game in 2015 during a college summer break, but outside of that I’m just now learning the new Netcode for GameObjects.

When using the Unity network profiler on the client, we’re seeing network activity of about 20 B being sent out as the player moves within the game.

I’ll also work to experiment with the server-side logging of packets and report back.


Which version of the Unity Transport package are you using? I remember seeing such TLS-related crashes with secure WebSockets and I made a few fixes around that. Latest version is 2.2.1 for your information.

Also, as a workaround you could try not using WebSockets on the server side. Unity Relay supports brokering connections where the different ends do not share the same connection type. So you could, for example, have your WebGL clients connect with the “wss” connection type while your Linux server uses “udp” or “dtls”; see the sketch below.
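Applied to the server setup code posted above, that would be a one-word change where the server binds its end of the allocation (a sketch, assuming the rest of the flow stays the same):

```csharp
// Sketch: bind the Linux server's end of the Relay allocation over DTLS
// instead of WSS; WebGL clients can keep joining the same allocation
// with "wss". The rest of the setup code stays as posted above.
Allocation allocation = await RelayService.Instance.CreateAllocationAsync(maxConnections);
NetworkManager.Singleton.GetComponent<UnityTransport>()
    .SetRelayServerData(new RelayServerData(allocation, "dtls"));
```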


Thank you for the suggestions.

We were using Unity Transport 2.1.0 and are now on 2.2.1. We’ve also removed WebSockets and wss from the server side.

I’ll monitor the server to see if these crashes still occur, and report back on the results.


This appears to have solved the problem. Thank you for the guidance :slight_smile:

An update here: the server no longer crashes when this error occurs; however, we still get the message occasionally in the server logs:

“Receive queue is full, some packets could be dropped, consider increase its size (256).”

This is happening with only 1-3 players online.

What would be the root cause of this error message? Is it possible that it’s somehow related to a WebGL build? Monitoring the outgoing packets from the client, nothing appears out of the ordinary.

What is a reasonable number to increase the “max packet queue size” to?

The root cause is that during that frame, more packets were received by the socket than we can read in our receive queues. Generally that’s not a cause for concern. The packets will simply wait in the socket’s buffer before being read on the next update.

Now the logical next question is why would so many packets be received in a single update? Typically this is due to having many clients send a lot of data over a short time frame, but the low number of players online here would tend to indicate that this is not the case. I’m thinking you might be dealing with a bug in our reliable pipeline that we fixed in Unity Transport 2.2.0.

Basically, the bug caused the reliable pipeline to resend packets uselessly in some circumstances. That could end up generating a lot of superfluous traffic. Since these resends happen at a lower level than our network profiling tools, they would not appear in your outgoing packets (unless you’re using a tool like Wireshark). And I know you mentioned updating the server to 2.2.1, but is it possible some of your clients are still using 2.1.0? It could be that these clients are hitting the bug. If that is indeed the problem, ensuring all the clients are updated would resolve the issue.

As for a reasonable value for the “max packet queue size”: this depends on the number of concurrent players you expect in a session, how active they will be, and how much data there is to synchronize. I’ve seen users use values anywhere between 64 and tens of thousands. For what it’s worth, we default to 512 nowadays, as we found that’s a good value for a lot of types of games.

If you want a better feel for the consequences of increasing this number: the main impact on your application is memory usage. Each unit of that queue size ends up generating roughly 4 KB of memory allocations, so setting the value to 512 means allocating about 2 MB of memory. On server hardware, even a value in the thousands will be relatively insignificant. A quick calculation is sketched below.
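As a quick back-of-envelope check of those numbers (the ~4 KB-per-entry figure is taken from the explanation above):

```csharp
// Back-of-envelope memory cost of a queue, per the ~4 KB-per-entry figure
// mentioned above (illustrative calculation only).
int queueCapacity = 1024;        // e.g. the value used earlier in this thread
int bytesPerEntry = 4 * 1024;    // ~4 KB per queue slot
float totalMB = queueCapacity * bytesPerEntry / (1024f * 1024f);
Debug.Log($"~{totalMB:F0} MB at capacity {queueCapacity}"); // ~4 MB
```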


Thank you for the information. Your help has been extremely valuable.

This may have been the case. Since pushing updates out about 24 hours ago to ensure all clients are on Transport 2.2.1, we haven’t seen the message “Receive queue is full, some packets could be dropped, consider increase its size (256).” in our logs.

It should be noted that we also increased the “max packet queue size” from 256 to 1024.

All issues related to this topic appear to have now been resolved.
