I ran the executable from “CSharpTest.zip” in the above post using Unity’s Mono environment instead of the default Windows environment. (The code is slightly modified, so the timings are not the same as the results in the above post, but that’s not important.)
As you can see, the timings of the two are very different: mono.exe is as fast as the default Windows environment, while mono-bdwgc.exe is as slow as running this code in Unity.
So I guess that if I can switch Unity’s C# environment from mono-bdwgc.exe to mono.exe, multi-threaded performance will improve a lot.
I’m betting bdwgc stands for “Boehm-Demers-Weiser Garbage Collection”.
Why it would be that much slower at thread handling is not something I would necessarily know. It also doesn’t help that I haven’t looked at your code (at work, can’t be opening your source). But that is a HUGE difference, GC couldn’t be the only cause (unless you for some reason had some really GC intensive code… are you resizing arrays a lot or something?)
As for achieving the switch…
Well… I don’t know how to officially do it.
BUT, if you aren’t afraid of hosing your install and possibly having an unstable environment… you could just rename mono.exe to mono-bdwgc.exe (or vice versa, not sure exactly which one you want to replace).
Chances are, you are looking at the problem from the wrong angle.
First you need to look at the data format used for save / load. Is it a string (JSON, XML), is it compressed, or is it stored in some other format? A string format will probably be the slowest.
Then you need to look at how you process your data. Can anything be optimized? Can you remove unnecessary bits?
A save / load that takes 10 seconds is not a big deal.
Alternatively, you can consider loading asynchronously in the background.
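To illustrate the background-loading idea, here is a minimal sketch, assuming the save data is already in memory as text and that the parsing step does not touch any Unity API (those generally have to be called from the main thread). `BackgroundLoadSketch`, `CountRecords` and `LoadAsync` are hypothetical names, and the "parser" is only a stand-in for the real one.

```csharp
using System;
using System.Threading.Tasks;

public static class BackgroundLoadSketch
{
    // Hypothetical stand-in for the real parser: counts non-empty lines.
    public static int CountRecords(string savedText)
    {
        int count = 0;
        foreach (string line in savedText.Split('\n'))
        {
            if (line.Trim().Length > 0)
            {
                count++;
            }
        }
        return count;
    }

    // Runs the parser on a thread-pool thread so the caller is not blocked.
    public static Task<int> LoadAsync(string savedText)
    {
        return Task.Run(() => CountRecords(savedText));
    }
}
```

The caller can keep the returned task around and check `IsCompleted` once per frame, reading `Result` only after the task has finished.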
I’m trying to solve this problem from every angle I can think of.
But the truth is that the game’s code is very complicated, and I can’t modify the structure of the game data in a short time.
And I can’t switch the format of the saved data to binary, for reasons I can’t control.
So multi-threading is the fastest way to improve performance. Of course, I’m working on other solutions too.
I just didn’t mention them here because there is nothing much to say about them.
There’s a bunch of things you should check here:
Are you running this in a build? That might give better/different results.
If it’s GC issues you’re running into, you could try the incremental GC.
With regards to your other thread on the same(ish) issue, remember that Unity’s using some threads as well. Try to have the same number of threads running as there are available logical cores, with Unity running on the main thread taken into consideration. Also realize that there is some inherent overhead to running as a part of a game engine. Doing something very simple will always be slower when there’s an engine running at the same time than when there isn’t. The results you’re getting in the other thread are excessive, though.
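As a rough sketch of the thread-count advice, assuming one logical core is set aside for Unity’s main thread (the helper name `WorkerCount` and the default of one reserved core are my own assumptions, not anything Unity prescribes):

```csharp
using System;

public static class WorkerCount
{
    // Leave some logical cores for the engine, but never go below one worker.
    public static int ForLogicalCores(int logicalCores, int reservedForEngine = 1)
    {
        return Math.Max(1, logicalCores - reservedForEngine);
    }

    // Uses the logical core count reported by the runtime.
    public static int Default()
    {
        return ForLogicalCores(Environment.ProcessorCount);
    }
}
```

If Unity’s own worker threads turn out to be busy during your save / load, reserving more than one core may work better; that is something to measure rather than assume.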
Yes, based on my limited knowledge of the Boehm GC. The Boehm GC is not designed for multi-threaded apps, meaning that it can only allocate or deallocate on a single thread at a time, and it will lock the other threads when it needs to do these tasks as well as when it scans memory.
For more information check the scalability section of the Boehm GC website linked below.
We’ve identified some issues in the other thread as well, and based on that finding about the GC it all suddenly seems to make sense. For example, the current implementation allocates a ton of arrays.
Looking forward to seeing the performance with the improvements that are suggested in all the threads you’ve opened (pun intended :p).
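If allocation really is the bottleneck, one way to cut it down is to give each worker a single scratch array and reuse it, instead of allocating a new array per work item. The sketch below is not the code from the test project; `BufferReuseSketch`, `WorkerState` and the sum-of-squares workload are hypothetical stand-ins, and it assumes `Parallel.For` is available in your scripting runtime.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BufferReuseSketch
{
    // Per-worker state: one scratch array, allocated once and reused for every chunk.
    private sealed class WorkerState
    {
        public readonly int[] Scratch;
        public long Sum;
        public WorkerState(int size) { Scratch = new int[size]; }
    }

    // Sums the squares of 0..itemCount-1 in chunks of chunkSize.
    public static long Run(int itemCount, int chunkSize, int workers)
    {
        long total = 0;
        int chunks = (itemCount + chunkSize - 1) / chunkSize;
        var options = new ParallelOptions { MaxDegreeOfParallelism = workers };

        Parallel.For(0, chunks, options,
            () => new WorkerState(chunkSize),           // runs once per worker task
            (chunk, loopState, state) =>
            {
                int start = chunk * chunkSize;
                int count = Math.Min(chunkSize, itemCount - start);
                for (int i = 0; i < count; i++)
                {
                    state.Scratch[i] = start + i;       // fill the reused buffer
                }
                for (int i = 0; i < count; i++)
                {
                    state.Sum += (long)state.Scratch[i] * state.Scratch[i];
                }
                return state;
            },
            state => Interlocked.Add(ref total, state.Sum)); // merge once per worker task
        return total;
    }
}
```

The point is only that the inner loop allocates nothing, so the workers have no reason to contend on the allocator while they run.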
I compiled mono.dll myself, with “SGen GC” instead of “Boehm-Demers-Weiser GC”.
(search “github Unity-Technologies/mono”, I can’t paste the URL…)
Then renamed mono-2.0-sgen.dll to mono-2.0-bdwgc.dll, and replaced the original file (replaced MonoPosixHelper.dll at the same time).
In the test project, the running speed became as fast as the default environment.
But in the real game, it crashed before the main window opened.
So I’m sure that the problem is “Boehm-Demers-Weiser GC”, but the solution to the problem is not simply to replace the garbage collector.
I also tried the incremental GC; it slightly improves performance, but not nearly enough.