My project contains ~250 EditMode tests across ~25 different files. The majority of the tests basically instantiate the entire GameManager and all other managers that come along with that. I’m more-or-less just starting the entire game, but without the overhead of PlayMode.
(It’s a 2D top-down simulator game if that’s relevant, similar to Prison Architect / One More Island / Another Brick in the Mall.)
Individual tests complete very quickly (between 0.01s and 0.3s), but if I use ‘run all’ the last test always takes very long (~30 seconds). Running this ‘last test’ individually shows it completes really fast by itself.
What is causing this wrap-up to take so long, and/or how can I figure this out? I’ve tried:
Running the Unity Profiler. Without Deep Profiling enabled it managed to complete, and I can see the last frame took roughly 7 minutes, but there are no real details explaining why. Running with Deep Profile causes the editor to crash, and if I run it on a smaller set of tests I can’t seem to reproduce the issue.
Adding some code in a [TearDown] method like the code below, but I couldn’t find any leak on the face of it.
[TearDown]
public void TearDownGameManager()
{
    long preBytes = GC.GetTotalMemory(false);
    int preObjects = UnityEngine.Object.FindObjectsOfType<UnityEngine.Object>().Length;

    GameManager.TEST_ResetGameManager();
    // Note: UnloadUnusedAssets is asynchronous, so the numbers below may be
    // measured before it has actually finished unloading.
    Resources.UnloadUnusedAssets();
    GC.Collect();

    long postBytes = GC.GetTotalMemory(false);
    int postObjects = UnityEngine.Object.FindObjectsOfType<UnityEngine.Object>().Length;
    Debug.Log($"Memory: {preBytes} -> {postBytes} ({preBytes - postBytes} freed)");
    Debug.Log($"Objects: {preObjects} -> {postObjects} ({preObjects - postObjects} freed)");
}
If you have any pointers to likely causes, or suggestions how to debug this better, please let me know.
My hunch is that initializing a whole “GameManager” for every (!) test is probably generating lots of garbage or other temporary resources that need cleaning up.
You could try MemoryProfiler to check on that.
Note that with the kind of game you are making I would expect the whole simulation to be testable without a single MonoBehaviour involved, with the possible exception of pathfinding/transforms. But it does sound like your unit tests aren’t actually testing the smallest possible “unit”; rather, they are integration tests where each test only works with a whole lot of other systems.
Thanks for the response. The majority is indeed integration tests, and not unit tests. We could definitely benefit from more unit tests (in addition), but I’m not sure if that means we have too many integration tests per se.
We could probably benefit from mocking, but I haven’t figured out how to get that to run properly. Suggestions there are appreciated as well, of course.
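[Editor’s note: a minimal sketch of the mocking idea discussed here. All names (IClock, FakeClock, Simulation) are hypothetical and not from the actual project; the point is that model code depending on an interface instead of a concrete Controller singleton can be tested without instantiating a GameManager at all.]

```csharp
// Hypothetical names for illustration only.
// The model depends on an abstraction instead of a scene singleton:
public interface IClock
{
    float DeltaTime { get; }
}

// A hand-rolled fake for tests; production code would wrap UnityEngine.Time.
public sealed class FakeClock : IClock
{
    public float DeltaTime { get; set; } = 0.016f;
}

public class Simulation
{
    private readonly IClock _clock;
    public float Elapsed { get; private set; }

    public Simulation(IClock clock) => _clock = clock;

    public void Tick() => Elapsed += _clock.DeltaTime;
}

// In a test, no GameManager or scene is needed:
//   var sim = new Simulation(new FakeClock { DeltaTime = 1f });
//   sim.Tick();
//   Assert.AreEqual(1f, sim.Elapsed);
```

The same pattern works with a mocking library (e.g. NSubstitute or Moq, both usable in Unity via a UPM/NuGet package), but hand-rolled fakes avoid adding a dependency.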
Wrt your comment about MonoBehaviour, I actually don’t think there are MonoBehaviour scripts triggering - but I might just be misunderstanding the situation. I do load the scene, but the Controllers attached to game objects don’t get an Awake/Start/etc. So they don’t register callbacks on the model and don’t do other actions.
I expect you might be right on the “other temporary resources that need cleaning up” - but I can’t really find which ones. There is definitely a significant GC going on (roughly 29MB per test), but I’m not sure why that would lead to a long ‘wait time’ only for the last test.
I mean, if each individual test were slow, that would clearly be something I need to change, but the slowdown I’m experiencing only gets to an annoying level when running multiple tests in batch. It basically seems to increase exponentially, which is the part that surprises me.
If that’s for each of the 250 tests you’d be creating about 7 GB of garbage. Cleaning that up at the end of a test run (ie after all tests ran) will take some time for sure.
But even for a single test it sounds like a looooot of garbage. You should analyze a single test and identify the source of the garbage, and if it’s from your own scripts, fix that. Then check if this speeds up the end of the test run and if it does, you know the issue is with creating too much garbage.
Also updating to the latest patch level and test framework is worth considering, especially if the project lags behind many patch versions.
Thanks for your reply. As mentioned in the original post I’ve experimented with a [TearDown] method that calls GC.Collect after every test, to see if that would affect things. But it didn’t really.
Although it’s hard to disagree that reducing the GC is a good thing, to me it doesn’t yet explain what exactly accumulates during all those tests - as I would expect that my GC.Collect takes care of GC. But it’s perfectly possible I’m just misunderstanding how it works.
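[Editor’s note: one possible gap worth ruling out, sketched below. A single GC.Collect() does not reclaim objects that are still waiting on finalizers (common for types wrapping native resources), so a collect/wait/collect sequence gives a more settled reading. This is standard .NET behavior, not something specific to the project discussed here.]

```csharp
using System;

static class GcProbe
{
    // Returns managed heap size after letting the GC fully settle.
    public static long MeasureSettledMemory()
    {
        GC.Collect();                   // first pass: reclaim unreachable objects
        GC.WaitForPendingFinalizers();  // let finalizers run (e.g. native handles)
        GC.Collect();                   // second pass: reclaim now-finalized objects
        // forceFullCollection: true lets GetTotalMemory wait until the value stabilizes.
        return GC.GetTotalMemory(true);
    }
}
```

If the numbers from this still climb across tests, the memory is genuinely rooted somewhere (statics, event subscriptions, caches) rather than just uncollected garbage.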
I have analyzed a single test, but have yet to investigate what those calls are exactly. It doesn’t (directly?) seem to be caused by my own scripts, though.
It seems like TestRunner is creating all this garbage when filtering assemblies. I suppose it’s using LINQ a lot and creates copies of collections. Still seems awfully much even for an editor tool.
Also check for TestRunner package updates. There’s even a new “v2.x.x” which isn’t officially listed yet.
I’ve updated TestRunner from 1.1.33 to 1.4.5 (it wasn’t showing up in Package Manager as an update), which caused some failing tests, because I think the scene that’s used (something isolated) might differ from the older version.
That led me to finding a couple of references from model code to Controller singletons (which are attached to GameObjects), and I expect that led to a lot of stuff happening with the scene that’s not needed for these tests. It didn’t show in the Profiler, but removing these references (which shouldn’t have worked like that, of course) made a significant difference.
Time dropped from ~2m (including ~30s teardown time) to roughly ~25s. GC in the profiler also dropped to ~20 KB.
Finally, I compared the speed of 1.1.33 with these fixes vs 1.4.5 with these fixes. That dropped an additional ~5s, so the newer version optimized some things under the hood.