Unity folks, Network Heartbeats...

How often is the server running HeartBeat(), and how often is the client sending heartbeats back to the server? Is this happening every game loop or on a timed interval?

The reason I am asking is that this takes up bandwidth in an exponential way. Regular heartbeat checks should happen on every loop, but heartbeat sends should happen only about every 5 to 10 seconds at most during idle time, just to tell the server the client is still alive. If the clients are sending a blank heartbeat every game loop, this needs to be changed ASAP, because it is killing bandwidth from the client to the server and causing issues on the server side with the monthly bandwidth usage that is paid for to the hosting company.
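To make the "check every loop, send only on an idle interval" idea concrete, here is a minimal sketch. It is not Unity's or RakNet's API: `HeartbeatScheduler`, `noteTrafficSent` and `shouldSendHeartbeat` are hypothetical names, and the 5 second interval in the usage note is just the low end of the range suggested above.

```cpp
#include <chrono>

// Hypothetical helper (not part of Unity or RakNet): decides when an idle
// client should send a keep-alive, so the per-loop check stays cheap and
// the actual send happens only once per idle interval.
class HeartbeatScheduler {
public:
    using Clock = std::chrono::steady_clock;

    HeartbeatScheduler(std::chrono::milliseconds idleInterval, Clock::time_point start)
        : interval_(idleInterval), lastSend_(start) {}

    // Call whenever real traffic goes out; it already proves the client is alive.
    void noteTrafficSent(Clock::time_point now) { lastSend_ = now; }

    // Call once per game loop. Returns true only when the idle interval has
    // elapsed since the last send, and then restarts the interval.
    bool shouldSendHeartbeat(Clock::time_point now) {
        if (now - lastSend_ < interval_) {
            return false;
        }
        lastSend_ = now;
        return true;
    }

private:
    std::chrono::milliseconds interval_;
    Clock::time_point lastSend_;
};
```

Constructed with `std::chrono::seconds(5)`, and assuming roughly 60 game loops per second, this would send one keep-alive where a per-loop send would emit 300.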

Found a mega networking problem: when Unity loses focus, the heartbeat stops… HUGE problem, I mean HUGE problem, did I mention HUGE problem?

Every single game loop, at the start of the loop, a network heartbeat MUST be triggered, absolutely MUST BE TRIGGERED. This tells me that the Unity networking is not in its own thread; instead it is a child of Unity’s single-threaded model… Let me go back to the statement, VERY HUGE PROBLEM.

On top of that, I have yet to detect the heartbeat send command being triggered from the client, when running as a client, for an empty echo heartbeat. Not so huge of a problem but still a major problem. Guys, come on, we users have absolutely zero power here, I do mean zero power. Unity has this coded internally as part of the engine. Even though we can create a server, we do not control the heartbeats, you do. This is number 1 on the RakNet ‘A MUST DO’ list.

This has to go in 2.6, or a 2.5.2, this is a HUGE problem! Either that or add the heartbeat commands to our list of commands on the base Network class you have. I’ll go back to the major statement of HUGE PROBLEM…

This is part of the RakPeerInterface and it MUST exist and it MUST be properly used. This is pretty much the control of checking for and acting upon packets.

I would never have run across this if I hadn’t started helping people today with the internal network configuration in Unity, which is based on RakNet. You have overlooked a completely major component of RakNet, and I do mean major.

The application has to have a background thread that RakNet runs on, separate from the main thread, so that the main application can lose focus without RakNet ceasing to run. (I’ll now refer to huge problem.) When, in testing, the client connects to the server on the same machine, the LOG does not show the connection as long as the client has focus. When the client loses focus and the server gains focus, the log then shows that the client connected. AT THAT POINT the client is reset because the thread has frozen state, so when the client gains focus again, its current PORT is 0 instead of its connected port to the server.

HUGE problem.
So I can now hear people go “so what, he is either stupid or nuts”, as they close this from their view and move on about their day. But let me say this: this is a HUGE problem, I seriously can’t stress that enough. This is what causes the problem with the server eventually timing out or the clients arbitrarily disconnecting for no reason.

Unity has utterly ignored the request time and time again to allow us control over the game loop, rather completely refused it, which is fine and your deal, but if you do this and have the Raknet wrapped around the game loop outside of our control, you MUST and I do mean absolutely MUST have a handle on that heartbeat, both for the client and the server. At this point though, you don’t. Seriously, there are zero network commands for us to send a periodic ping to the server, or to have the client send the server a heartbeat request, or for the server to control the general heartbeat.

I had sent in a ticket almost a week ago now with a project that uses Raknet as a .Net assembly, in that project is a client setup and a server setup within the core interface that uses the Heartbeat commands. In essence in the purest form of Raknet, this is the RakPeer Packet Receiver:

	/// Gets a message from the incoming message queue.
	/// Use DeallocatePacket() to deallocate the message after you are done with it.
	/// User-thread functions, such as RPC calls and the plugin function PluginInterface::Update occur here.
	/// \return 0 if no packets are waiting to be handled, otherwise a pointer to a packet.
	/// \note COMMON MISTAKE: Be sure to call this in a loop, once per game tick, until it returns 0. If you only process one packet per game tick they will buffer up.
	/// \sa RakNetTypes.h contains struct Packet
	virtual Packet* Receive( void )=0;

Which is the same as the Heartbeat in the .Net wrapper assembly. Please note the method definition, and pay very close attention to the “COMMON MISTAKE:” This is what is happening in Unity right now, the “COMMON MISTAKE”…
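For anyone who has not used RakNet directly, the loop that the “COMMON MISTAKE” note asks for looks roughly like this. This is a self-contained sketch: `FakePeer` is a stand-in I made up so the example compiles on its own, it is NOT the real RakPeerInterface. Only the names `Receive()` and `DeallocatePacket()` come from the header comment quoted above.

```cpp
#include <cstddef>
#include <deque>
#include <memory>
#include <string>
#include <utility>

// Stand-in types for illustration only; the real ones live in RakNet.
struct Packet {
    std::string payload;
};

class FakePeer {
public:
    // Simulates a message arriving from the network.
    void enqueue(std::string payload) {
        incoming_.push_back(std::unique_ptr<Packet>(new Packet{std::move(payload)}));
    }

    // Mirrors the quoted contract: returns 0 (nullptr) when nothing is waiting.
    Packet* Receive() {
        if (incoming_.empty()) {
            return nullptr;
        }
        Packet* p = incoming_.front().release();
        incoming_.pop_front();
        return p;
    }

    // The caller hands the packet back once it is done with it.
    void DeallocatePacket(Packet* p) { delete p; }

    std::size_t pending() const { return incoming_.size(); }

private:
    std::deque<std::unique_ptr<Packet>> incoming_;
};

// Call once per game tick: keep receiving until the queue is empty,
// instead of handling a single packet and letting the rest buffer up.
std::size_t drainIncoming(FakePeer& peer) {
    std::size_t handled = 0;
    while (Packet* p = peer.Receive()) {
        // ... dispatch on p->payload here ...
        peer.DeallocatePacket(p);
        ++handled;
    }
    return handled;
}
```

If the tick only ever handled one packet, anything arriving faster than one message per tick would pile up in the queue, which is the buffering the note warns about.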

Since the Unity game is more or less in a sleep() state when focus is lost, guess what doesn’t get called? When the server is running and chugging along, and clients start to disconnect arbitrarily, what is not being called? This…

I saw this happening and I was like “OMG”, no way… I am a complete buffoon and I ran across this. I am the most hyper individual on the planet, and caught this. People have been complaining of client disconnects and server issues since 2.1, and I bet you odds are this is the problem child. If the buffer of commands gets too far behind, the server will just puke; if the client loses focus and sleeps the network interface, it will drop from the server; if the server has anything that takes precedence and sleeps the Unity thread too long, the server will barf and die.

This really has to be fixed by the next release.
Seriously.

project settings->run in background

seriously

That is not the fix.
This is a core UNITY engine bug. (seriously)

Enabling the application to run in the background through the corresponding script command will prevent the application or RakNet from stopping.
This is a MAJOR MUST DO, I repeat, a MAJOR MUST DO, if you work with Unity and want it to be able to run in the background or as a windowed application, to allow chatting etc. outside.

I’ve never lost connection or anything with Unity, neither on 2.1 nor 2.5, neither on Windows nor OSX, when it was enabled to run in background, and I commonly have 3 - 5 client windows open at a time to do the testing, so 2 - 4 of them are inactive.

If you disable the world update by not enabling it to run in the background, then I would at least expect it to stop everything, not just the world simulation, as the simulation, especially the interpolation, would completely explode the moment the window becomes active again, since the network is worlds ahead.

That being said, if the networking really runs in the main thread (which I am not fully sure of yet), then I would welcome its move into its own thread as well. Not for the non-reason here, but to lighten the CPU load on the main thread and make the technology more capable as a world simulation node for multiplayer games.

If you look at my ALREADY HAVE DONE, you will see that Unity is NOT DOING THE MAJOR NEEDED DONE. I was also painfully aware that someone would be utterly clueless as to how the project is set up and would think that I do not have it set to run in the background (aside from the fact that I don’t have the Unity package here in the thread, because of a core Unity posting issue which I have a ticket open for: I can’t post projects, rars, zips, or anything other than simple few-K script files), so I bothered to set up a video as physical proof, which nullifies that argument.

I took the time to do a JING video as complete proof that this is a core Unity issue. The script that I used as a test was the script I was helping the other person with on the other thread. I will attach that here also.

If you would like to watch the video:
http://screencast.com/t/t7CoowSO2

And sheer proof that the client data is getting dumped due to improper use of network calls outside of our control. Had this been the proper use, the myPlayer object would not get set back to null after a few game loops (which happen in nanoseconds of time).

Now if you look at the code, explain to me how myPlayer is getting set back to null unless the Network.player is losing information. Obviously myPlayer is a reference to this “live” object, which should remain live until told otherwise.

178105–6368–$networktest_150.js (2.57 KB)

There are two major problems in your assumptions:

  1. You run it in the editor and assume the thread does not run in the background. That’s something that can’t happen; the editor stuff always runs, regardless of whether you set the project to run in background or not.

  2. The network connection remains valid.
    The server does not get the OnPlayerDisconnected callback till you close the client player, which means that up to that point the connection was valid; otherwise the client couldn’t terminate it to notify the server of leaving. If it had been terminated before, it would have resulted in a disconnect call earlier.

What is being reset is the data you get from myPlayer, a NetworkPlayer you cached before.
If you replace myPlayer with Network.player it works all fine, even if both players are in background.

I have both windows open on my second screen while writing this.

This is Unity 2.5, Windows Vista 64 Business

The only issue with both of these is that, if you watch the video, I first show you a server running and a client running, then I show you a server running in the editor with the client running outside the editor. I covered both grounds. myPlayer should not lose its information in either case, but it does. It is a pointer to a referenced object, which is a pointer to the active Network.player, so if that still existed, the data would still be valid, but it is not.

Watch my video again to see that I run the server and the client first, independent of the Unity editor, before I run the Unity editor as the server to show the debug. There are issues here and we need direct access to be able to call the peer receive ourselves.

As mentioned, I tested it in both scenarios:

  1. host in editor, client standalone
  2. client and host standalone.

Network.player is always valid. The connection is not terminated.

Neither when host is in background nor when client is.
I had both running for hours in the background while doing other things without problems.
No reset of data, no disconnect.

I would have been worried if it just started to have problems in that scenario as my code has always worked and I always test it locally with various client windows inactive.

The only problem, and that’s significantly different from what the thread is about and tries to imply, is that your myPlayer reference points into nowhere.
That basically means that the network player for your local player has changed at some point and I can’t answer the question to you why or when, so you best open a bug ticket for this one, as it is the only bug.

Interesting findings on your end; my machine has a client disconnect after about 18 minutes of sitting there. Unfortunately I do not have software that will sit there and video record the entire time while both server and client are running so that the results can be seen. However, myPlayer never returns with a value once the pointer is destroyed (the unknown factor here that you can see and test). Ultimately, since the receiver is not programmed correctly and we have no direct access to it, the more data that is cached up from more players, the more the connections end up resetting from time to time due to loss of too much information in the queue.

We need a Network.receive() so that we can force a manual flush from time to time on our end. I have several bugs in the queue that I am waiting on a response for. One is network specific but does not use Unity’s internal RakNet networking: I have a SWIG version of RakNet as a C# assembly with the correct heartbeats set up, and it has issues since Unity just frankly dies. I created that ticket almost a week ago and, as of this morning, not a peep about it. I seriously can’t believe that no one else is experiencing network delays or network drops or even server resets from time to time, because that is what these issues I have mentioned will ultimately cause.

  1. server will stop listening
  2. clients will abruptly disconnect
  3. ports will reset on clients
  4. server will hang

Those are the issues you will see as more players start to join the game with the way Unity is configured at the moment, this is just going to happen.

What information in the queue are you talking about?
There is nothing left in the queue, RunInBackground will make the app run as if it was in foreground, processing all the data.
Disabling RunInBackground is only an option for fullscreen only games, a thing of the past, so the default optimally should be RunInBackground anyway.

I agree with you that more access, and especially fully documented information on Unity’s RakNet for implementing a C++ server for Unity games, would be a great thing.

As for Problems:
We have never had any reset problems or the like, and our test server commonly ran up to a week in a row until we restarted it for updates.

No drops, delays or anything.
The only problem we had back then (Unity 2.1, Windows build), due to the fact that it was a server box with no gpu and remote access, was the actual performance of the same, which is why we even started to consider other server backend technologies at all.

The only cases where I’ve heard of such problems so far were attempts at manual network instantiation where the dev’s own code caused the actual problem.

None of those things though will solve problems you create through external applications that try to talk to unity through RakNet.
The Unity RakNet implementation is not available unless you invest 5 figures+, from what has been mentioned, so without that you are basically blindly guessing what it does. Not only is Unity at best on RakNet 3.0 (that’s a guess based on the master server code; we don’t have more evidence than that), but Unity also has its own modified and ported version of it, as there was no RakNet for OSX at the time it was integrated into Unity.

These two things together have a nice potential to cause small to major problems when trying to make non-Unity entities talk to Unity through networking.

What I am talking about with the queue is the message queue. Messages sent to the server are not processed until the server calls its own receive method; on the client the same is true, no messages sent to that client are processed until receive is called. Receive has to be called once per game loop on both the client and the server at a minimum. If you know you should be processing data more often than that during a single game loop, you need to call receive a few more times at the appropriate times and places. Since we have no control over the receive, we cannot add the additional receive calls we need.

As far as no “OSX” version being available: sure, there has always been a version available and instructions on how to build it. OK, well, not always, but as early as June of 2007, see

http://www.jenkinssoftware.com/raknet/forum/index.php?topic=1331.0

Later someone came out with a better way than I had to build an assembly file for OSX, which is possibly what they are using. (The links to the attachments on that thread are broken at the moment; I might have to recreate them for Jenkins.)

The latest way to build that assembly is located in the instructions that he has for building for OSX and it creates a .a file that you can drag into the plugins folder and reference.

I don’t see a reason to process the network income more than once per world simulation step.
I don’t gain anything from it, as it won’t be processed earlier anyway (next world simulation step)

The only case where you potentially would require more regular checks is in situations where you have many clients connected.
That is not such a problem, as Unity officially is targeted at scenarios with <= 100 clients and up to that point you shouldn’t have any problems with the way the networking works. If you need more to significantly more clients, then choosing Unity’s networking is just the wrong approach until it has a completely headless dedicated server.

As for OSX: how old do you expect Unity’s network support to be? I hope you don’t assume that the networking was added “just recently”, as it has been in since at the very least 2.0, which is just as old as that post, and it definitely was in production for more than a handful of weeks :)

How old? Newer than 1.6, since it didn’t exist at all in 1.6, and it really wasn’t stabilized until 2.1, I don’t think. Besides, the limitation of <=100 is a built-in limitation of Unity which doesn’t need to exist at all. With Unity I know you can’t have a single client talking to multiple servers, not in the current configuration, but having only a single receive call is still bad, since that puts a limit on how many RPC calls you can make per game loop, literally. Those messages start to add up exponentially even going from, say, 4 players to 20 players. In a tactical game, a combat unit movement is broadcast out. Now let’s say you have 100 combat units: that is 100 messages broadcast out per game loop just for that single scenario, and depending on how much information is sent per unit, this can get hectic fast.

Now add, let’s say, 20 people with 100 units each in a broadcast: that is 2000 messages in a single game loop. At around, let’s just say, 10 game loops per second (that is a low number BTW), that comes out to 20,000 messages per second, so the need to have more than one receive per loop comes into play.
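The arithmetic above can be written out as a tiny helper, which makes it easy to plug in other player counts. `messagesPerSecond` is a hypothetical name, and the model is the same simplification used in the post: one broadcast message per unit per game loop.

```cpp
#include <cstdint>

// Hypothetical helper: every unit of every player produces one broadcast
// message per game loop, as assumed in the example above.
constexpr std::uint64_t messagesPerSecond(std::uint64_t players,
                                          std::uint64_t unitsPerPlayer,
                                          std::uint64_t loopsPerSecond) {
    return players * unitsPerPlayer * loopsPerSecond;
}

// 20 players x 100 units = 2000 messages per loop; at 10 loops per second
// that is 20,000 messages per second.
static_assert(messagesPerSecond(20, 100, 1) == 2000, "per-loop count");
static_assert(messagesPerSecond(20, 100, 10) == 20000, "per-second count");
```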

If you still don’t agree with that, I am not sure where or how to explain the receiver.

That’s interesting. During my load-tests with up to around 80 players I did have a rather strange phenomenon that some RPC calls got severely delayed (a couple of seconds) while most went through just fine. I noticed this because some of my own prediction code got all weird (10+ seconds going straight in one direction might throw a player completely out of the level). So I added a check on the NetworkMessageInfo.timestamp difference and whenever this is more than a few seconds write into my log.
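The check described above could look something like the following sketch. It is plain C++ rather than Unity script, `DelayMonitor` is a hypothetical name, and it assumes both timestamps are in seconds on the same network clock (which is what NetworkMessageInfo.timestamp is compared against in the description).

```cpp
#include <vector>

// Hypothetical helper: records every message whose delivery delay exceeds
// a threshold, so delayed RPCs show up in a log instead of only as odd
// prediction behaviour.
struct DelayMonitor {
    double thresholdSeconds;           // "more than a few seconds"
    std::vector<double> loggedDelays;  // stand-in for writing to a log file

    // Returns true when the message was delayed beyond the threshold.
    bool check(double sentTimestamp, double receivedAt) {
        const double delay = receivedAt - sentTimestamp;
        if (delay <= thresholdSeconds) {
            return false;
        }
        loggedDelays.push_back(delay);
        return true;
    }
};
```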

Unfortunately, this issue was somewhat hard to reproduce - but with this information (internal queues potentially adding up) it shouldn’t be too hard to set up some test-case (“headless clients” that you can instantiate manyfold on an average machine, spamming RPCs to the server and back in a well-defined manner … with “headless” I refer to clients that have the cameras switched off).

On the other hand, a definite official statement like “yeah, oops … we’re only doing a single receive” or “nope, that’s not the case” would probably be the much quicker way of figuring that out ;)

Regarding the heartbeat issue: With the “standard Unity networking implementation” I don’t think connections can get lost because a game is running in the background. I did most of my tests with a couple of clients on each machine, so most of the clients were running in the background most of the time. And I don’t remember having significant disconnects from those test-clients.

Sunny regards,
Jashan

The problem with RPCs is that they are sent reliably: if something goes wrong, they are resent and resent and resent until they are correctly received.

If you throw that together with position updates, which are commonly low latency, you can get into pretty troublesome situations regardless of the network solution and behavior.

That’s why positions are commonly sent unreliably (and why UDP got the interest it has today at all), because you don’t give a crap anymore about the last position once the next position is received. You would just drop it if timestamp > lastUpdateOnThatObject || timestamp - lastUpdateOnThatObject > maxTimeThreshold

That sounds good to me ;)

As timestamps are rising, it is naturally < not > for the first check:

timestamp < lastUpdateOnThatObject || timestamp - lastUpdateOnThatObject > maxTimeThreshold

Sorry for the typo
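Written out as a function, the corrected check looks like this. It is only a sketch of the condition as posted: `shouldDropUpdate` is a hypothetical name, the parameter names are taken from the expression above, and the timestamps are assumed to be in seconds.

```cpp
// Returns true when an unreliable position update should be discarded:
// either it is older than the last update already applied to the object,
// or it is further ahead of that update than the allowed threshold.
bool shouldDropUpdate(double timestamp,
                      double lastUpdateOnThatObject,
                      double maxTimeThreshold) {
    return timestamp < lastUpdateOnThatObject
        || timestamp - lastUpdateOnThatObject > maxTimeThreshold;
}
```

With the last applied update at 10.0 and a threshold of 2.0, an update stamped 9.0 is dropped as stale, 10.5 is accepted, and 13.0 is dropped by the threshold check.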