Memory Garbage Collection Issue

Hey community!

Our application has to run for days, or in the worst case even weeks, without memory or performance issues.
In most cases we have the chance to restart our Unity application at least once per day.
While testing these restarts, our testing team found some really strange behaviour that we are currently investigating.

Roughly once in 50 starts we run into a memory issue.
The strange thing is that in those runs the memory increases continuously right from startup, even without doing anything in the application itself.

To give a little context to the project:
Unity is the client part that connects to multiple server instances (C# applications).
The server instances basically run our game flows, and Unity visualizes them and lets the user interact with the game.

So even with no interaction with the application there is still stuff going on.
It then runs for around 2 hours, continuously eating more memory, until it finally runs out of memory.

I once managed to capture a memory dump with the Memory Profiler that showed the following output:
(Snapshot A is the game with the issue after running for about 1h; Snapshot B is the same game, with the same setup/hardware, after running for more than 4-5h.)

So it seems like we have an issue with heap fragmentation. But only sometimes?

After analyzing the memory dump without finding anything that looks unusual compared to the dump from a normal run, we did some investigation into the Garbage Collector.

We already had some tooling in place from previous games that shows how often a collection ran (basically GC.CollectionCount(0), (1) and (2)), and we found out that in the runs where the game starts using more and more memory, the result of GC.CollectionCount(0) just stops increasing at around 40 collections. (We are currently testing whether that is also the case for CollectionCount(1) and CollectionCount(2).)

We also already tried calling GC.Collect manually whenever we notice that the memory increases while the collection count stays the same. But even when it is called manually, the count no longer increases and the memory usage keeps growing.
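For anyone who wants to watch for the same pattern, here is a minimal sketch of such a check. It is not our actual tooling: the class name `GcStallDetector`, the threshold and the sample window are all hypothetical. It only relies on `GC.CollectionCount(0)` and `GC.GetTotalMemory(false)`, and it flags the case where the managed memory keeps growing while the collection count does not move.

```csharp
using System;

// Hypothetical helper (an assumption, not existing tooling): flags the "bad run"
// pattern where managed memory grows while the GC collection count stalls.
public sealed class GcStallDetector
{
    private readonly long _growthThresholdBytes;
    private readonly int _samplesRequired;
    private int _lastCount = -1;
    private long _baselineBytes;
    private int _stalledSamples;

    public GcStallDetector(long growthThresholdBytes, int samplesRequired)
    {
        _growthThresholdBytes = growthThresholdBytes;
        _samplesRequired = samplesRequired;
    }

    // Returns true once the count has been unchanged for the required number of
    // consecutive samples and memory grew by at least the threshold meanwhile.
    public bool AddSample(int collectionCount, long usedBytes)
    {
        if (collectionCount != _lastCount)
        {
            // A collection happened (or this is the first sample): restart the window.
            _lastCount = collectionCount;
            _baselineBytes = usedBytes;
            _stalledSamples = 0;
            return false;
        }

        _stalledSamples++;
        return _stalledSamples >= _samplesRequired
            && usedBytes - _baselineBytes >= _growthThresholdBytes;
    }

    // Convenience wrapper that reads the live values from the runtime.
    public bool SampleRuntime()
    {
        return AddSample(GC.CollectionCount(0), GC.GetTotalMemory(false));
    }
}
```

Calling `SampleRuntime()` from a timer, for example once every few seconds, and logging when it returns true would be one way to mark a run as 'bad' early. Sensible values for the threshold and window depend on the workload.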

I really don’t know what I should think about that, so I’m hoping to get some input or ideas on what I could start looking into, or to hear whether someone else has experienced an issue like this before.

Some Facts:

  • Unity 2020.3 LTS
  • Incremental Garbage Collector → deactivated
  • Build Target: Windows
  1. Small Note: I think in your explanation you mixed up snapshot A and B, as A (the top one) shows more memory being used
  2. You might want to open those snapshots in version 1.1.1 of the Memory Profiler package, using an empty 2022 project, just to confirm that it’s not a bug in the package’s crawler of the managed heap data that made it ignore some references and thereby left it unable to find objects only referenced that way (i.e. references held via struct fields)
  3. Unity’s Boehm GC is non-generational, so checking GC.CollectionCount for other generations is not going to reveal anything.
  4. Pre Unity 6, every 6th GC.Collect entirely empty managed heap pages (4KB blocks) would be “unmapped” just to, on most platforms(*), be immediately remapped but without reporting that to the Memory Profiler, resulting in a growth of untracked memory. I’m in the process of backporting the fix for it being untracked to 2022 but it likely won’t go further back. So just to be aware of this:
  5. calling GC.Collect manually might reduce the apparent “empty fragmented heap space” while not really reducing the committed memory amount, just hiding it
  6. If it doesn’t reduce this way, you’re fragmenting it at a rate where no 4KB block ever gets to be entirely empty, so you might want to chase down any extra allocations between snapshots (or generally GC Alloc events in the CPU Profiler) and see if you can reduce them, to keep fragmentation from getting away from you this badly
  7. Generally, you should never have to call GC.Collect, and doing so is generally a bad practice. It’s automatically triggered when a new allocation needs more space than currently available on the heap.
  8. (*) “Most platforms” means this does not include Windows and PS3 or, as of the latest patch versions of the 2021.3+ LTS releases, Linux.

First of all - thanks for the fast reply!
Will discuss your input today with my team.

To clarify/give you some more information to some of the points:

  1. Actually I haven’t mixed them up, but my explanation was probably a bit misleading.
    Both snapshots are from the same Build (executable) on the same machine with the same setup.
    But they are from different runs (starts) of the application.

    Snapshot A is from the 1 in 50 runs where we actually have the memory issue.
    Snapshot B is from one of the other 49 runs, where we did not have the memory issue.

    In both runs we have a similar workload, so both runs should allocate about the same number of objects.
    But only sometimes do we run into the memory issue. And the weird thing here is that we already know
    after a few minutes whether the run is a ‘bad’ one.
    Any ‘normal’ run of the application can go on for multiple days without a memory issue.

  2. I forgot to mention in my initial post that we are running in a Windows environment; I will update my initial post as well.

I will keep you updated with any further info from our side.

Ah ok that makes sense now in regards to the ordering of snapshots…

As a small update from my side regarding my previous point 4:
I’ve found out Unity 6 seems to still have these untracked entirely empty “unmapped” managed heap pages (at least on Windows Standalone), it just reports them now before they get unmapped, but not after. Still something that needs backporting (i.e. that older versions did not properly report entirely empty pages before they got unmapped) but obviously not something that solves the entire problem.

What that also means for your capture is that the fragmented heap space is structured such that each page-sized block (mostly 4KB) still has at least one live object in it. So untangling mixtures of small and large, short-lived and long-lived managed allocations could still be a way of reducing this growth of fragmented memory.
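To illustrate why a single live object per page is enough to keep that memory around, here is a toy model. It is only a sketch under loud assumptions: objects are equally sized and placed one after another into 4KB pages, which is not how Boehm actually lays out its heap, and the class and method names are made up for this example.

```csharp
using System;
using System.Linq;

// Toy model (an assumption, not Boehm's real allocator): equally sized objects
// are placed back to back into 4 KB pages in allocation order.
public static class PageFragmentationModel
{
    public const int PageSize = 4096;

    // "Frees" every short-lived object and counts the pages left without any
    // live object, i.e. the only pages that could ever be unmapped.
    public static int CountEmptyPages(bool[] isLongLived, int objectSize)
    {
        int objectsPerPage = PageSize / objectSize;
        int pageCount = (isLongLived.Length + objectsPerPage - 1) / objectsPerPage;
        var liveObjectsInPage = new int[pageCount];

        for (int i = 0; i < isLongLived.Length; i++)
        {
            // Only long-lived objects survive, and each one pins its page.
            if (isLongLived[i])
                liveObjectsInPage[i / objectsPerPage]++;
        }

        return liveObjectsInPage.Count(live => live == 0);
    }
}
```

With 6400 objects of 64 bytes each (100 pages), marking every 64th object as long-lived leaves 0 empty pages in this model, even though over 98% of the bytes are free. Allocating the same 100 long-lived objects together at the start leaves 98 of the 100 pages empty.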