I always ran/run performance tests with VSync turned off. Is this perhaps a reason why I see the Incremental GC cost?
Yes, thank you! The project attached to the bug-report was submitted with Unity 2017 a few years ago. If it helps, I can also submit it again with 2021.1 and the options I used for the latest test.
Otherwise, here are the Player Settings that I used for the latest tests. Create a Win64 build with and without Incremental GC to see the difference. Also make sure to run it on hardware similar to what I used to submit the bug report (Case 1108597); you most likely can't reproduce it on higher-end hardware.
If VSync is disabled, targetFrameRate is not used, and you are running on a platform that does not have any platform-specific frame timing, then the incremental GC will always assume it can use up to 1 millisecond per frame for its work (as it has nothing to use for a better estimate). So, yes, it is possible that what you are seeing is the incremental GC doing its work; a profiler sample would tell. But it should not need this time continuously, only when collection needs to happen. I guess you are only sampling a few frames? Maybe in the non-incremental case you have one big collection happening (before you start taking time samples), and in the incremental case that time is spread over the first X frames, which you are seeing. What if you wait some time? Will you still see a difference? Again, comparing profiler output graphs could probably tell.
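As a side note, here is a minimal sketch (my own illustration, not something from the project) of giving the incremental GC a better budget when VSync is off: either set a frame-rate target it can derive a budget from, or set the time slice explicitly via GarbageCollector.incrementalTimeSliceNanoseconds. The 200 fps and 3 ms values are arbitrary examples.

```csharp
using UnityEngine;
using UnityEngine.Scripting;

public class GcBudgetSetup : MonoBehaviour
{
    void Start()
    {
        // Option A: give Unity an explicit frame-rate target to derive a GC budget from.
        QualitySettings.vSyncCount = 0;
        Application.targetFrameRate = 200; // arbitrary example value

        // Option B: set the incremental GC time slice directly (3 ms here, purely as an example).
        GarbageCollector.incrementalTimeSliceNanoseconds = 3000000;
    }
}
```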
I have not been able to load your project into a newer version of Unity easily (did not spend much time on it yet, either). If you can submit an updated version, that would indeed be helpful.
It's more than a few. The 20 samples you see in the image below represent 20 seconds, where each sample is the average frame-time over the last second.
If the game runs at 200 frames per second (200 fps = (1 / 5 ms) × 1000), that would be 200 × 20 = 4,000 frame-time samples in total.
[Image: graph of the 20 average frame-time samples]
I repeated these tests three times and used the minimum average frame-time in the graph; this should remove a lot of noise.
Since the test goes over 20 seconds and the average frame-time gets reset each second, I don't think a single GC spike at the beginning would explain the entire test being slower.
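For reference, a minimal sketch of this kind of measurement (my assumption of how it works, not the actual test code): accumulate frame times, emit the average once per second, and reset the window.

```csharp
using System.Collections.Generic;
using UnityEngine;

public class FrameTimeSampler : MonoBehaviour
{
    readonly List<float> samplesMs = new List<float>(); // one entry per second
    float accumulatedMs, windowSeconds;
    int frames;

    void Update()
    {
        accumulatedMs += Time.unscaledDeltaTime * 1000f;
        windowSeconds += Time.unscaledDeltaTime;
        frames++;

        if (windowSeconds >= 1f) // close the one-second window
        {
            samplesMs.Add(accumulatedMs / frames); // average frame-time for this second
            accumulatedMs = 0f;
            windowSeconds = 0f;
            frames = 0;

            if (samplesMs.Count == 20) // 20 seconds of data collected
                Debug.Log(string.Join(", ", samplesMs));
        }
    }
}
```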
For what it's worth, this thread and the associated work that tracked the performance issue down to incremental GC are making me think I might take the step of migrating from 2018.4 to a later version at some point. Thanks again Peter!
This shouldn't stop you from upgrading. After all, you can turn it off when you notice it affects your game negatively, and then it's the same behavior as in 2018.4.
Thanks. I took a look at that project - but on my machine, I could not see any difference between builds with and without incremental GC. Now, my frame times are generally higher than yours, and the project seems to be GPU bound on my computer (a 2017 MacBook Pro) - so it would be possible that performance differences are masked by the time spent waiting for the GPU.
But: I also looked at samples from both builds using the profiler. I looked for two things where I'd suspect incremental GC to cause a difference:
1. Incremental collection taking time. But I did not see any time spent in the GC (it does not seem to kick in in either incremental or non-incremental mode, as the project barely allocates any memory at runtime).
2. Differences in time spent in scripts. In incremental GC mode, scripts are compiled with different settings (the write barriers mentioned above), which can cause some overhead; see the conceptual sketch below. But I could not see any difference in time spent in script code in either of the two builds.
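Purely as a conceptual illustration of that overhead (this is not the code Unity actually generates, and the names are made up): with incremental GC, every reference store into a heap object goes through extra bookkeeping so the collector can track changes between its slices.

```csharp
class Node { public Node next; }

static class ConceptualWriteBarrier
{
    // Stand-in for the real barrier, which records the modified object
    // (e.g. in a remembered set) so a later GC slice can re-scan it.
    public static void RecordReferenceStore(object owner) { /* mark owner as dirty */ }
}

static class Example
{
    // Without incremental GC, the store is just: node.next = other;
    // With incremental GC, it conceptually becomes:
    static void StoreWithBarrier(Node node, Node other)
    {
        ConceptualWriteBarrier.RecordReferenceStore(node); // small extra cost on every reference store
        node.next = other;
    }
}
```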
So - I'm not doubting your findings, but I cannot reproduce them. Can you get Unity Profiler graphs showing the differences in frame times? That would be very useful here.
GPU bound? The test runs (should run) at 320x240 resolution to avoid being GPU bound, so I'm more than surprised. Did you run it on Windows or OSX?
I've uploaded Unity profiling data recorded with the Profiler.enableBinaryLog functionality; please see:
(Case 1300171) 2021.1: Follow-up on (Case 1300104) "Incremental GC" cost
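In case anyone wants to reproduce this kind of capture, here is a minimal sketch of recording a binary profiler log from a player build (the output path and component name are just my assumptions); the resulting .raw file can then be loaded in the Profiler window:

```csharp
using UnityEngine;
using UnityEngine.Profiling;

public class ProfilerCapture : MonoBehaviour
{
    void OnEnable()
    {
        Profiler.logFile = Application.persistentDataPath + "/capture.raw"; // assumed output path
        Profiler.enableBinaryLog = true; // write binary data the Profiler window can load
        Profiler.enabled = true;         // start recording
    }

    void OnDisable()
    {
        Profiler.enabled = false;
        Profiler.enableBinaryLog = false;
        Profiler.logFile = "";
    }
}
```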
Here are some screenshots so everybody can follow along…
No Incremental GC
With Incremental GC
Comparison
Please note that the Unity profiling data recording itself costs performance too, so these numbers don't necessarily match the graphs posted earlier.
PS: The profiler data captures seem to contain "CPU Usage" only; all memory alloc data etc. seems to be missing? I've asked whether this is by design here: https://discussions.unity.com/t/821014
On a MacBook running OSX. I don't think the actual rendering was taking much time, but the blitting took some ms (possibly upscaling to the retina display?). Not much, but enough to mask the small differences in the rest of the frame (which is also very fast).
Thanks. This is useful.
So, what I see here: you have uploaded 4 different data sets, each with and without incremental GC. In each case, the non-incremental version is faster, but never by as much as in the numbers you originally posted.
Differences in frame times I see for the 4 data sets (unless I'm interpreting them wrong?): 0.04 ms, 0.09 ms, 0.19 ms, 0.2 ms. Drilling down, the difference seems to be completely in managed code. That is in line with the overhead from incremental GC requiring managed code to be generated with write barriers. And in this scope, I think the difference is not unexpected - there is some cost related to incremental GC (or any more modern GC for that matter) for the write barriers.
But what is strange is that the numbers you are seeing without profiling show higher differences. I have no explanation for this atm.
I don't know this (I'm not directly involved with profiler development). But just FWIW - I don't think that allocations play a role here. It does not seem to be the collecting part of the GC that is taking time (there is barely any GC happening here), but the overhead from tracking references for later collection.
Btw, thanks for sharing all this data. This is very useful (even though I may not have any answers on how to improve this, other than turning off incremental GC for your project, or optimizing the managed code which is slower to change fewer references in heap objects).
Isn't this an issue with all MonoBehaviours and not just UI? I think in the past SetActive always generated garbage and caused some FPS spikes too; has something been improved already?
It's not exclusive to UI, but the UI Components were a trivial case for me to report. Not every UI Component causes GC alloc, btw.
It's not specific to the MonoBehaviour/Component base classes, but to the specific implementation of that MonoBehaviour/Component. Not every MonoBehaviour/Component causes GC alloc.
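To illustrate the difference (a hypothetical example, not actual Unity UI code): whether SetActive(true) produces a GC alloc depends on what the specific component does in OnEnable.

```csharp
using System.Collections.Generic;
using UnityEngine;

public class AllocatesOnEnable : MonoBehaviour
{
    List<int> scratch;

    void OnEnable()
    {
        scratch = new List<int>(64); // fresh heap allocation every time the object is activated
    }
}

public class NoAllocOnEnable : MonoBehaviour
{
    readonly List<int> scratch = new List<int>(64); // allocated once with the component

    void OnEnable()
    {
        scratch.Clear(); // reuses the existing buffer, no GC alloc on activation
    }
}
```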
I have been following these threads about comparing versions for a long time.
(I know Peter's tests avoid GPU load.)
For my project I wrote a Windows-based tool to make changes to all Unity GPU core scripts at once: first, to make sure that there are no incompatibilities when updating the GPU core scripts to a newer version, and also to keep all changes to any Unity GPU core script outside of Unity itself, so I can pass the changes into any version at any time with one click.
This thread reminds me to compare the GPU side of 2018.2.xx with 2020.x, based on all the shader & core stuff.
Further, I can definitely remember that when I first compared the .NET 4.x beta vs .NET 3.x, all the scripting stuff was about 30%-70% faster on .NET 4.x in every situation… but about 2-3 builds later the speed advantage was completely gone. That was a bit annoying, because the .NET team kept shouting "it's so much faster" the whole time.
After discovering the issue with Incremental GC: is the mobile performance of 2019/2020 still considered poor compared to 2018? I want to upgrade my project to 2020 to utilize the visual shader tools, but only if I can get as good performance on old devices as I do on 2018.
Unity 2018 ran perfectly for me. Unity 2021 makes my game run like crap. Huge FPS loss, to where I'm finding myself micro-optimizing just to get that performance back. From terrible lighting performance to terrible particle system performance to annoying popup windows for every little thing I touch, click etc. Why do I have to perform lighting hacks of turning lights on and off to not blow my triangle counts out of proportion when it never used to happen? Why do I have to adjust my particles to be what I don't want, lower the resolution etc. to squeeze back performance I had in 2018? I could seriously write a huge list of issues I have with Unity 2021 vs using Unity 2018, which was excellent. I'm almost afraid to make any changes as I feel those changes will be deprecated in a few months and the cycle starts all over again.
Seems like backwards progress, if I'm being honest.
If you're interested in an old Mono fork vs. new Mono fork performance comparison, don't look any further; it's here: Unity Future .NET Development Status page-4#post-7378859
Very true, I see the same, with 2018 being the exponentially faster version; 2019 is still passable, and anything after that is like watching a slide show of popup windows.
Seems Unity 2020+ needs a monster PC, while 2018 runs lightning fast on my 4-year-old laptop.
The difference is so big that it feels like different software altogether at this point.