I’ve been investigating a freeze in my game that occurs when I switch between two levels. In doing this, I first unload the old level, then additively load the new level. This is done for several reasons, including some stuff to do with networking which I won’t go into.
I eventually managed to reproduce this in a nearly empty project with two nearly empty scenes, both of which contain a large number of lightprobes. (LightingData.asset is about 34 MB)
Profiling this using both “Additive” & “Single” LoadSceneModes I observed that there was a massive difference, with the “Additive” method getting a huge spike caused by whatever “PostLoadSceneStaticLightmapSettings” is doing, and the spike being absent when using the “Single” method. (see screenshot)
I’ve also sent this as a bug report to Unity, but am curious if anyone else has also run into this and may have found a reason and/or workaround for it?
Hi @kristijonas_unity, can you please confirm the bug report has reached your team? It is a major problem for us and we wouldn’t want it to get lost.
It still hasn’t. I suspect that it might have gotten stuck in the incoming bug reports queue. I’ll reach out to our customer QA directly tomorrow about this.
Wanted to provide you with a quick update on this. We’ve found out that the issue reproduces regardless of which LoadSceneMode is used. The resulting performance spike is almost identical.
Our developer has taken a brief look at the code, and there seems to be nothing funky going on there at a glance. They will keep investigating.
Thanks for the update! It’s interesting that in my test case I was not getting the performance spike nearly as badly in Single LoadSceneMode (though the test itself was not exactly identical).
I also did some more testing and found that after reducing the number of lightprobes to about 2/3 (LightingData from 34MB to 20MB) the performance spike dropped dramatically (from 1000ms to 4-5ms). I tested this in the editor and may have made a mistake, since that seems very weird. At the moment we’re working on a test that reduces the light probes so we can compare the difference on our target device (where the spike is currently over 20000ms, which is why this is a major problem for us).
We’ve now done a bit more thorough testing and what I’ve said above is not true at all.
The performance spike grows roughly linearly with the number of lightprobes, meaning the more you have, the larger the spike. Target device measurements:
50000 lightprobes: ~23000ms spike
25000 lightprobes: ~13787ms
10000 lightprobes: ~7072ms
This means that as you’ve noted there is most likely not something weird going on in your code in terms of a literal bug. However, this is obviously still a major problem in trying to asynchronously load a scene. (and obviously unacceptable in terms of the performance spike)
I’d like to know whether you could offload this work to another thread, or at least spread it out over multiple frames so that it does not fully lock the main thread?
If not, it is basically impossible for us to use lightprobes at all for this device… which would be very unfortunate as it greatly affects the visual quality of the game.
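As a rough check on the linear relationship, the three target-device measurements above can be run through an ordinary least-squares fit. This is only a sketch: three data points are very few, the helper names `fit_line` and `probes_for_budget` are made up for illustration, and the 5000 ms budget is an assumed example threshold, not a measured limit.

```python
# Target-device measurements quoted above: (light probe count, spike in ms).
measurements = [(50000, 23000.0), (25000, 13787.0), (10000, 7072.0)]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_xx = sum(x * x for x, _ in points)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept


def probes_for_budget(slope, intercept, budget_ms):
    """Probe count at which the fitted line reaches budget_ms (an extrapolation)."""
    return (budget_ms - intercept) / slope


slope, intercept = fit_line(measurements)
print(f"per-probe cost: {slope:.3f} ms, fixed cost: {intercept:.0f} ms")

# Hypothetical budget, for illustration only.
budget_probes = probes_for_budget(slope, intercept, 5000.0)
print(f"probes that fit in a 5000 ms budget: {budget_probes:.0f}")
```

On these numbers the fit suggests a cost of roughly 0.4 ms per probe on top of a fixed cost of about 3.4 seconds, so the spike is linear in the probe count but not strictly proportional to it. Under that (extrapolated) model, only around 4000 probes would fit in a 5000 ms budget.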
Thanks for the bug report, I have looked into the root cause of what you have observed.
In your case you load a single scene at a time and this is usually an optimal case because then there is no need to recalculate tetrahedralization information, a simple memory copy operation is enough. I’m happy to report that we are in the process of backporting an optimisation from the next Unity release to 2021.3 which will roughly double the speed of this copy operation in your case.
However, it may be that double speed is not quite good enough for you, so you could use a trick which is not well-known. While it is not possible to do the memory copy in a background thread, it is possible to recalculate tetrahedralization information on a background thread by having a persistent scene loaded first, one with very few light probes, and then loading the real scene. The effect is that tetrahedralization information must be calculated after loading the second scene, which can be done asynchronously with LightProbes.TetrahedralizeAsync.
There will, however, still be a small spike left since some copying will still take place on the main thread, so going forward you should consider breaking up the scene into multiple smaller scenes, each with a much lower number of light probes. This is often a good practice.
Also, you could consider reducing the density of the light probes network. While it’s convenient to generate a dense network it can be too resource intensive on low-end devices. In the sample project you provided, the tetrahedralization information alone took up about 30 MB of memory. Selectively positioning light probes as described in the documentation would require manual work but could improve performance and results considerably.
In short:
- A performance fix will provide nearly a 2x boost in terms of performance. The fix is being backported to 2021.3.
- Create a scene and place a single light probe group into it. This will act as a persistent reference for light probe data. Additively load other scene(s) containing light probes and re-tetrahedralize asynchronously.
- Split the light probe network into smaller chunks, and place them into their own separate scenes. Additively load them at runtime.
- Consider reducing the density of the light probe network. Perhaps most probes are not really needed.
Hi belgaardunity, thanks for looking into the problem.
Whilst the performance improvement is nice, it indeed will not fix the problem for us because the spike would still be way too large, even if we reduce the amount of lightprobes significantly.
I tried to test your suggestion regarding a persistent scene reference, but in the various tests I ran I wasn’t able to make this work without keeping that scene as the active scene forever. Is this expected? I’d also like to make sure I understand: the persistent scene can also be loaded additively, correct?
For reference, the order of operation would be:
1. Load the persistent scene with 1 light probe (additive+async)
2. Set the persistent scene as active
3. Load the real level scene (additive+async)
4. Set the real level scene as active
--- level switch ---
5. Unload the old real level scene (async)
6. Load the new real level scene (additive+async)
7. Set the new real level scene as active
And for completeness’ sake: we were already using Tetrahedralize in our loading process, just not starting with a “dummy scene” as a reference point.
If the requirement is that the “persistent” scene cannot be loaded additively and requires single load mode, this is full-stop not an option. If the “persistent” scene CAN be loaded additively, but must be the ActiveScene “forever” this would require significant reworks in a lot of our code, because the active scene for instance also spawns gameobjects in that scene.
Splitting up a level in multiple scenes also brings significant other feature limitations, workflow limitations and loading performance issues in working with Unity, which is why this really isn’t an option for us either…
Finally whilst reducing the light probes is possible (also sacrificing detail/quality), we won’t be able to get it to a point where this isn’t a problem for us. 10000 lightprobes, which is 1/5th of our original setup still gives a 7000ms spike, whereas anything over 5000ms really isn’t acceptable.
I’d like to understand the spike itself a bit more as well: What exactly causes the spike?
A. Loading the 30MB lightprobes from disk into memory?
I can’t imagine this is the problem as loading from disk is easy to do on either another thread or over multiple frames rather than in a single frame/operation.
B. Merging/Unmerging lightprobes between 2 scenes?
I don’t want to do this at all, my levels are completely unrelated from one another! Is there no way to disable this and force unloading & loading as if it is a new scene? (so A.)
C. Tetrahedralization? <something else?>
I doubt this is the problem as well, because I was already doing this operation and could see in the profiler this is not where the spike was.
D. Something else???
As noted, I may be missing the entire reason why the spike exists in the first place.
The trick I mentioned in my second bullet point is meant to avoid a memory-to-memory copy on the main thread (that copy is what normally spares you from having to call Tetrahedralize). There is no need to make the persistent scene active, but you need to bake it for the trick to work. In terms of your sample project, try to load the persistent scene along with your existing startingscene in the hierarchy, then enter play mode and run your script. You will notice that the spike is significantly smaller (but you need to call Tetrahedralize before the lighting will look right).
I hope this works for you.
Here are some specifics related to your questions,
A. Loading the 30MB lightprobes from disk into memory?
It’s a memory-to-memory copy, it’s done as fast as C++ memcpy can do it (and that’s fast). Unfortunately, with the current code structure this cannot be done safely in a background thread.
B. Merging/Unmerging lightprobes between 2 scenes?
In your sample project, you already unload, then load, so you only have a single scene loaded at a time. That’s often optimal, but in your case you have massive light probes data in a single scene and that is why you see a spike.
C. Tetrahedralization? <something else?>
Explicit tetrahedralization is not needed when you have only a single scene loaded at any given time. You can verify this by checking that no LightProbes.needsRetetrahedralization event is raised. With the trick I mentioned you will need to explicitly re-tetrahedralize after loading your real scene.
D. Something else???
I hope the above explanation helps.
I’ve just tested your suggestion again, now using the sample project I’ve sent as you mentioned. The trick reduces the spike loading from “startingscene” to “Scene_A” (16ms to 8ms), but NOT “Scene_A” to “Scene_B” that occurs afterwards. (still ~1000ms)
Just to confirm I’m not crazy, I swapped the loading order of A & B and got the same pattern: the second load still takes about 1000ms in the editor.
Here is a link to the updated sample that now includes the dummy scene. Press T to include loading dummy scene, Press Y to exclude loading dummy scene. Adding the dummy scene before entering play mode (and then pressing Y) made no difference for me either. https://drive.google.com/file/d/1glaUixaAS-GvzGkw0EHO1sbGhRna0ZBa/view?usp=sharing
I’m still confused about the spike. Scene_A has 30 MB of lightprobes, but so does Scene_B, so why is it not a problem when I load the first scene? (And to back that up: surely it doesn’t take over 1 second on a PC to copy 30 MB of data in memory?)
Hmm, now I’m confused.
First of all, a memcpy of 30 MB will not take a second on a PC. To measure it on my i9 PC, I added more profiling info and ran a debug build (in effect a much slower Unity editor) under the profiler; the copy took around 12 ms.
Secondly, I tried your suggestion: I appended your dummy scene in the hierarchy, entered play mode and pressed Y. I could not reproduce anything like your 1 second spike; it does not show up in the profiler. I used the official 2021.3.5f1 LTS for this. Could you show a screenshot of what you see in the profiler?
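As a quick sanity check on the memcpy numbers above, here is the copy throughput each timing would imply. This is plain arithmetic, not a benchmark; the 30 MB size is the approximate figure used in this thread, and the function name is made up for illustration.

```python
def implied_throughput_mb_per_s(size_mb, duration_ms):
    """Copy throughput implied by moving size_mb in duration_ms."""
    return size_mb * 1000.0 / duration_ms


SIZE_MB = 30.0  # approximate light probe data size discussed in the thread

# ~12 ms: the copy as measured in a debug build under the profiler.
measured = implied_throughput_mb_per_s(SIZE_MB, 12.0)
# ~1000 ms: the spike reported in the editor.
reported = implied_throughput_mb_per_s(SIZE_MB, 1000.0)

print(f"12 ms copy   -> {measured:.0f} MB/s")
print(f"1000 ms spike -> {reported:.0f} MB/s")
print(f"ratio: {measured / reported:.1f}x")
```

If the whole 1000 ms spike were a single 30 MB copy, it would be running at about 30 MB/s, far below what even the debug build measured. That gap suggests the spike is dominated by something other than the raw copy.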
edit: also added a zip with profiler data (note: I shortened the time between loads in my script to get a smaller profiler snapshot so I could upload this ^^)
Update: I’ve now also tried this in 2021.3.5f1 LTS and am also NOT getting this spike, interesting! Will try some more versions (and also on the main game) and get back to you.
In my main project I’m getting the following results after upgrading to 2021.3.5f1:
Without a dummyscene, the problem still occurs consistently on 2nd level load
With a dummy scene, the problem did not occur on 2nd level load
This is different from what we’re seeing in the samplescene, which worries me because we don’t know what is the cause for the spike. While it didn’t occur in a specific test I did, there is no way of knowing it won’t come back through circumstances we’re unaware of even with the “trick”.
I’ll need to do more thorough testing of this in both the sample project and my main project, but would really appreciate it if you would confirm the issue also occurs for you in the sample project with an older LTS version (like 2021.3.4f1) and if so, if you would be able to find out the cause of the massive spike. (and potentially the reason why that doesn’t occur in the sample project with .5f1) This way, there isn’t some mysterious/magical vanishing of the issue that could reappear but instead we have concrete evidence.
After running more tests since the upgrade, I can confirm the issue still occurs in my main game, even with the dummy scene. There was a bug in my code that sets up the dummy lightprobe scene: the lightprobes of the real level weren’t being used at all, which is why the spike did not appear.
This means this is still an active showstopper for us unfortunately.