Help! I can’t seem to remove a loaded model from memory!

Hi Everyone,

So, I think either I’m doing something very wrong, or I fundamentally misunderstand how Unity manages memory. In either case, I’d really value your insight.

My goal is very simple, I want to:

  1. Snapshot RAM usage on an empty scene
  2. Load an additive scene with a textured FBX in it
  3. Unload the additive scene and FBX
  4. Snapshot RAM usage again and, critically, have it return to roughly the same level as in step 1.

However, I can’t for the life of me manage to achieve this simple goal.

My Setup

Hardware:

Up to date Windows 10 machine, Intel Processor, Nvidia graphics card.

Software:

Unity 6000.0.25f1, Built-In Render Pipeline
Memory Profiler 1.1.3 (com.unity.memoryprofiler)

Scenes:

RunThisScene – My starting scene, just a camera and my “RunThis” script.
ModelScene1 – Just the FBX (textured) and a directional light
EmptyScene1 – Completely empty. It exists only to test the claim that assets unload with a one-scene delay, from this thread: What on earth is "Reserved" memory and why is killing me? - #18 by Ikaro881

Script:

using System;
using System.Collections;
using System.IO;
using System.Threading.Tasks;
using UnityEngine;
using Unity.Profiling.Memory;
using UnityEngine.SceneManagement;

public class RunThis : MonoBehaviour {

	const string MODEL_SCENE_1 = "ModelScene1";

	const string EMPTY_SCENE_1 = "EmptyScene1";

	void Start() {
		Demonstration();
	}

	async void Demonstration () {

		await Task.Delay(2000);

		await SaveMemorySnapshot(1);
		
		await Task.Delay(2000);

		await LoadScene(MODEL_SCENE_1);

		await Task.Delay(5000);

		await UnloadScene(MODEL_SCENE_1);

		await Task.Delay(2000);

		await ForcedMemoryCleanup();

		await Task.Delay(2000);
		
		await SaveMemorySnapshot(2);

		await Task.Delay(2000);

		await LoadScene(EMPTY_SCENE_1);

		await Task.Delay(5000);

		await UnloadScene(EMPTY_SCENE_1);

		await Task.Delay(2000);

		await ForcedMemoryCleanup();

		await Task.Delay(2000);

		await SaveMemorySnapshot(3);
	}

	async Task SaveMemorySnapshot (int number) {

		bool snapshotTaken = false;
		
		MemoryProfiler.TakeSnapshot(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), string.Format("{0}_MemorySnapshot{1}.snap", DateTime.Now.ToString("yyyy-MM-dd_hh-mm"),number)), delegate (string filepath, bool success) {
			if (success == false) {
				Debug.LogError("Failed to take memory snapshot " + number);
			}
			snapshotTaken = true;
		}, CaptureFlags.ManagedObjects | CaptureFlags.NativeObjects | CaptureFlags.NativeAllocations | CaptureFlags.NativeAllocationSites | CaptureFlags.NativeStackTraces );

		while (snapshotTaken == false) {
			await Task.Delay(1);
		}
	}

	async Task LoadScene (string sceneName) {
		AsyncOperation asyncLoad = SceneManager.LoadSceneAsync(sceneName, LoadSceneMode.Additive);

		while (!asyncLoad.isDone) {
			await Task.Delay(1);
		}

		Scene disposableScene = SceneManager.GetSceneByName(sceneName);
		SceneManager.SetActiveScene(disposableScene);
	}

	async Task UnloadScene (string sceneName) {

		AsyncOperation asyncUnload = SceneManager.UnloadSceneAsync(sceneName, UnloadSceneOptions.UnloadAllEmbeddedSceneObjects);

		while (!asyncUnload.isDone) {
			await Task.Delay(1);
		}
	}

	async Task ForcedMemoryCleanup () {
		
		// Wait a single frame so any destroys Unity has queued can complete
		await Task.Delay(1);

		bool unusedAssetsUnloaded = false;

		StartCoroutine(UnloadUnusedAssetsCoroutine(delegate () {
			unusedAssetsUnloaded = true;
		}));

		while (unusedAssetsUnloaded == false) {
			await Task.Delay(1);
		}
		
		GC.Collect();
	}

	IEnumerator UnloadUnusedAssetsCoroutine (Action callback) {
		yield return Resources.UnloadUnusedAssets();
		callback();
	}

}

Memory Profiler Results

All snapshots are taken from a windows development build. I’m trying to get my head round it there before I complicate things by going back to iOS/iPadOS.

Please see the script above for context on when the snapshots below were taken.

Things I don’t understand

  1. Why has resident memory still increased by 30.9MB? I’m working on an iPad application which needs to load multiple models in sequence on devices with only 2-3GB of RAM. We can’t afford to keep leaking memory every time we change model.
  2. In the final comparison, why is the “Diff” in “Total Resident on Device” 30.9MB, but the combined Size Difference in the “All Of Memory” tab less than 16MB? Is “All Of Memory” actually just some of memory?
  3. The empty scene load+unload seems to clear around 16MB of untracked memory while marginally increasing tracked memory use… Yet resident memory, AKA the “Crashy Memory” I predominantly care about, only goes up. Why?
  4. Does iOS/iPadOS even have non-resident memory? I have a terrible feeling it might not leverage hard drive space and that, ultimately, I should be worrying more about total allocated memory regardless.

A thousand thank yous to anyone who read through all that.

First off, you got the point of the second scene unload wrong. It is only relevant insofar as it triggers a second Assets GC (aka Resources.UnloadUnusedAssets) cycle AFTER the scene has been unloaded instead of during it. Since you are already calling Resources.UnloadUnusedAssets manually, that second scene load-unload does nothing but load more stuff into memory and touch more files (which may then get mapped).

Second, you’re assuming you are leaking something. If you repeat the cycle, does memory usage keep growing like this? Or is it just some reserved amounts or something that gets an initial load into memory? Do one scene load-unload-assetGC-cycle first, then take your first snapshot, load your scene, take the second snapshot, unload it, take a third, asset GC, take a 4th.

The memory growth between the first and the 4th should now be relatively minimal. The other ones are more to demonstrate what actually got loaded and unloaded in between, and to prove that it got unloaded and was not leaked.

Now to your list:

  1. Files got touched, and reserved memory was recently used and made resident. When that becomes non-resident again is kinda hard to determine, as that is somewhat up to the OS. And see above: you’re basing your test setup on somewhat faulty assumptions, and continuous growth is not a given when extrapolating from one cycle with one scene/model.
  2. The All Of Memory comparison doesn’t actually compare Resident memory changes but Allocated, which is the total length of that Resident vs Allocated bar. That the comparison table modes do not offer a way to compare Resident amounts is a separate issue we’re aware of. Allocated isn’t less than Resident, it is more, and it changed less. As I mentioned previously, unused reserved memory got used, making it resident.
  3. You’re churning through memory usage by loading and unloading stuff, thereby touching more memory and making it resident.
  4. The answer to how that’ll behave is to be found in actually running it on a device, but as I mentioned, consider cleaning up your test scenario for that first so you are testing something somewhat more meaningful and less liable to wrong extrapolation, e.g. doing multiple cycles in a row and checking if memory usage stabilizes or grows continuously.

Also, you can unfold those tree elements in the comparison view to see where the growth occurs, and then switch to single mode to at least get some idea of how much of that growing memory category is Resident vs not by manually comparing it across the snapshots.
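To make the "multiple cycles in a row" idea concrete, here is a rough sketch of a runner that repeats the load-unload-assetGC cycle and logs a few memory counters after each one. It is an illustration under assumptions rather than a tested setup: the class name CycleLogger and the cycle count of 20 are made up, the scene name is borrowed from the script above, and "System Used Memory" is the profiler counter name as I remember it from Unity's ProfilerRecorder documentation.

```csharp
using System.Collections;
using Unity.Profiling;
using UnityEngine;
using UnityEngine.Profiling;
using UnityEngine.SceneManagement;

// Hypothetical helper, not part of the original test project.
public class CycleLogger : MonoBehaviour {

	const string MODEL_SCENE_1 = "ModelScene1";
	const int CYCLES = 20; // arbitrary; use enough cycles to see a trend

	IEnumerator Start () {
		// Assumed counter name: OS-reported memory usage of the process.
		ProfilerRecorder systemUsed = ProfilerRecorder.StartNew(ProfilerCategory.Memory, "System Used Memory");

		for (int i = 1; i <= CYCLES; i++) {
			yield return SceneManager.LoadSceneAsync(MODEL_SCENE_1, LoadSceneMode.Additive);
			yield return SceneManager.UnloadSceneAsync(MODEL_SCENE_1);
			yield return Resources.UnloadUnusedAssets();
			System.GC.Collect();
			yield return null; // give the recorder a fresh frame to sample

			// Shift by 20 bits to convert bytes to MB.
			Debug.Log($"Cycle {i}: allocated={Profiler.GetTotalAllocatedMemoryLong() >> 20} MB, " +
				$"reserved={Profiler.GetTotalReservedMemoryLong() >> 20} MB, " +
				$"systemUsed={systemUsed.LastValue >> 20} MB");
		}

		systemUsed.Dispose();
	}
}
```

If the logged numbers level off after the first few cycles, that points to warm-up and reserved capacity rather than a leak. If they climb by a similar amount every cycle, that is the case worth taking snapshots of.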

To add to this, the Task.Delay() is what I feel uneasy about. I’d rather use a coroutine that yields rather than await since awaiting a delay is not normally found in actual production code.

Most importantly, Task.Delay is not available on WebGL if that’s the target platform.
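For reference, a coroutine variant of the load/unload/cleanup sequence could look roughly like the sketch below. The class name RunThisCoroutine is hypothetical and the snapshot calls are left out to keep it short. It only uses yield instructions Unity provides out of the box, so it does not depend on Task.Delay.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.SceneManagement;

// Hypothetical coroutine variant of the sequence in the original script.
public class RunThisCoroutine : MonoBehaviour {

	const string MODEL_SCENE_1 = "ModelScene1";

	IEnumerator Start () {
		yield return new WaitForSeconds(2f);

		// Yielding an AsyncOperation resumes the coroutine once it is done.
		yield return SceneManager.LoadSceneAsync(MODEL_SCENE_1, LoadSceneMode.Additive);
		SceneManager.SetActiveScene(SceneManager.GetSceneByName(MODEL_SCENE_1));

		yield return new WaitForSeconds(5f);

		yield return SceneManager.UnloadSceneAsync(MODEL_SCENE_1, UnloadSceneOptions.UnloadAllEmbeddedSceneObjects);

		// Resume on the next frame so queued Destroy calls have been processed.
		yield return null;

		yield return Resources.UnloadUnusedAssets();
		System.GC.Collect();
	}
}
```

Compared to polling isDone with Task.Delay(1), the waits here are driven by the player loop, so "one frame" really is one frame.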

Thanks guys, that’s all incredibly helpful!

Though with the above in mind it strikes me as weird that the total allocated memory goes down after my unnecessary post-GC second scene load. Maybe that’s just more of the OS being fickle?

In any case, I’ll report back after I’ve cleaned up my testing scenario. Thanks again!

Hi Guys,

I’ve implemented @MartinTilo’s suggestions, but the results still baffle me. It looks like memory use increases at every stage of the second cycle. I must be doing something wrong somewhere…

Here’s my latest script revision:

using System;
using System.Collections;
using System.IO;
using System.Threading.Tasks;
using UnityEngine;
using Unity.Profiling.Memory;
using UnityEngine.SceneManagement;

public class RunThis : MonoBehaviour {

	const string MODEL_SCENE_1 = "ModelScene1";

	private int cycleCount = 0;

	void Start() {
		Demonstration();
	}

	async void Demonstration () {

		await Cycle(MODEL_SCENE_1, false);
		await Cycle(MODEL_SCENE_1, true);
	}

	async Task Cycle (string sceneName, bool doSnapshots) {

		cycleCount++;

		await Task.Delay(2000);

		if (doSnapshots) await SaveMemorySnapshot("pre-scene-load");
		
		await Task.Delay(2000);

		await LoadScene(sceneName);
		
		await Task.Delay(2000);

		if (doSnapshots) await SaveMemorySnapshot("post-scene-load");

		await Task.Delay(2000);

		await UnloadScene(sceneName);

		await Task.Delay(2000);

		if (doSnapshots) await SaveMemorySnapshot("post-scene-unload");

		await Task.Delay(2000);

		await ForcedMemoryCleanup();

		await Task.Delay(2000);
		
		if (doSnapshots) await SaveMemorySnapshot("post-GC");

		await Task.Delay(2000);
	}

	async Task SaveMemorySnapshot (string desc) {

		bool snapshotTaken = false;
		
		MemoryProfiler.TakeSnapshot(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), string.Format("{0}_{1}_MemorySnapshot_{2}.snap", DateTime.Now.ToString("yyyy-MM-dd_hh-mm"), cycleCount, desc)), delegate (string filepath, bool success) {
			if (success == false) {
				Debug.LogError("Failed to take memory snapshot " + desc);
			}
			snapshotTaken = true;
		}, CaptureFlags.ManagedObjects | CaptureFlags.NativeObjects | CaptureFlags.NativeAllocations | CaptureFlags.NativeAllocationSites | CaptureFlags.NativeStackTraces );

		while (snapshotTaken == false) {
			await Task.Delay(1);
		}
	}

	async Task LoadScene (string sceneName) {
		AsyncOperation asyncLoad = SceneManager.LoadSceneAsync(sceneName, LoadSceneMode.Additive);

		while (!asyncLoad.isDone) {
			await Task.Delay(1);
		}

		Scene disposableScene = SceneManager.GetSceneByName(sceneName);
		SceneManager.SetActiveScene(disposableScene);
	}

	async Task UnloadScene (string sceneName) {

		AsyncOperation asyncUnload = SceneManager.UnloadSceneAsync(sceneName, UnloadSceneOptions.UnloadAllEmbeddedSceneObjects);

		while (!asyncUnload.isDone) {
			await Task.Delay(1);
		}
	}

	async Task ForcedMemoryCleanup () {
		
		// Wait a single frame so any destroys Unity has queued can complete
		await Task.Delay(1);

		bool unusedAssetsUnloaded = false;

		StartCoroutine(UnloadUnusedAssetsCoroutine(delegate () {
			unusedAssetsUnloaded = true;
		}));

		while (unusedAssetsUnloaded == false) {
			await Task.Delay(1);
		}
		
		GC.Collect();
	}

	IEnumerator UnloadUnusedAssetsCoroutine (Action callback) {
		yield return Resources.UnloadUnusedAssets();
		callback();
	}

}

Here are the snapshots themselves in case anyone wants a closer look:
MemorySnapshots.zip

I find async code a lot easier to parse than coroutines, and in our larger projects (not used for any of these tests) it’s used by many of our dependencies, including at least one on the pseudo-official Unity registry. So I thought I’d better share the results with my async stuff left in first.

When I get the time I’ll switch my code over to coroutines and upload the results of that to see if that’s where I’m going wrong.
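A possible middle ground, sketched here under assumptions and not tested against this project: Unity 2023.1 and newer (so including Unity 6) ship an Awaitable class with frame-based waits, and to my knowledge AsyncOperation can be awaited directly there. That would keep the async style while tying the waits to the player loop instead of wall-clock timers. The class name RunThisAwaitable is hypothetical and snapshots are omitted.

```csharp
using System;
using UnityEngine;
using UnityEngine.SceneManagement;

// Hypothetical async variant; assumes Unity 2023.1+ (Awaitable, awaitable AsyncOperation).
public class RunThisAwaitable : MonoBehaviour {

	const string MODEL_SCENE_1 = "ModelScene1";

	async void Start () {
		await Awaitable.WaitForSecondsAsync(2f);

		// Assumption: AsyncOperation is directly awaitable in this Unity version.
		await SceneManager.LoadSceneAsync(MODEL_SCENE_1, LoadSceneMode.Additive);
		SceneManager.SetActiveScene(SceneManager.GetSceneByName(MODEL_SCENE_1));

		await Awaitable.WaitForSecondsAsync(5f);

		await SceneManager.UnloadSceneAsync(MODEL_SCENE_1, UnloadSceneOptions.UnloadAllEmbeddedSceneObjects);

		// Frame-based wait, unlike Task.Delay(1), which is tied to wall-clock time.
		await Awaitable.NextFrameAsync();

		await Resources.UnloadUnusedAssets();
		GC.Collect();
	}
}
```

This also removes the polling loops and the coroutine-plus-callback bridge around Resources.UnloadUnusedAssets.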

So if you compare the 3rd to the 4th snapshot:

you can see that some Meshes, Materials, Textures and Shaders have been successfully cleaned up by Resources.UnloadUnusedAssets. Their memory, and memory that was no longer needed by some Unity subsystems, has been freed and is now tracked as Native Reserved, as you’d expect. (Our native allocators mostly don’t return memory to the OS once allocated but keep it reserved for later reuse, e.g. for that next level and model you might want to load.)

The Graphics side of these assets has also been released. Since that memory is not handled by Unity directly but by the graphics driver (aka the Graphics API you’re using), Unity just has book-keeping entries for their sizes, which are deleted once the resources are no longer used. Unity never knows their precise location in memory, but the Memory Profiler package will subtract their amount from the Untracked sub-elements most likely to contain them for the respective platform. Once they are no longer used, it’s up to the Gfx Driver to do with that memory what it wants, e.g. keep it as buffers for the next thing it’ll need. So some growth in Untracked can more than likely be attributed to that.

So we’ve established that memory actually does get freed.

If we look at what got loaded into memory between snapshots 1 and 2:


We can see those same objects getting loaded into native and graphics memory. That plus the scene objects that use them.

We can also see that the Temp allocator grew a bit, probably during the load operation. The memory used by this allocator never lives longer than a frame and the grown capacity for it will be reused for later temp allocations so that is also probably just fine. You could possibly mess with the native allocator customization feature by reducing the block size and forcing more temp allocations to fall back onto slower allocators instead of growing this one, but I doubt that’s a really useful optimization to make. (And we can obv. ignore the fact that the Memory Profiler allocator grew as that won’t ever happen in practice)
BTW, with the Temp allocator we’re sure that’s not memory that got released because the memory allocated for it isn’t actually instrumented to keep its overhead low.

Oh and, to get that Reserved memory breakdown, go to the Memory Profiler Settings and enable it:

Now if you look at snapshots 2 and 3


You can see those Scene Objects getting unloaded with the scene, so that’s all nice and tidy as well.
The TempJob allocator grew for some jobs that ran, possibly during the unload; future job memory usage will reuse it. You could mess with the allocator settings, but that will likely just make things slower and force more allocations to use slower fallback allocators. (Also, the same thing as for the Temp allocator applies: we’re avoiding the performance overhead of actually tracking any allocations within it, so that’s just the base capacity that grew.)

Now back to that comparison from 1 to 4, aka how did the base line change:


Yes, untracked grew, though probably half of that is graphics buffers held by the Gfx Driver. The Temp allocators grew, but their memory will be reused and helps things run as fast as they can. Messing with them will likely just make the code that relies on them run slower.

There is an annoying growth in Unknown native memory. What that is is native allocations that have not been registered with a native root, like e.g. the manager they relate to. That is mostly a book-keeping bug and not necessarily proof that something leaked in native. That said, we can’t really reason about these without investigating that on our side with a repro project and native debugging (…unless you happen to have a source code license). If that book keeping was fixed, it would at least tell us what system is behind that. I’ve only very recently made investigating that on our side a bit easier and plan to do some investigations there to get them fixed in so far as I can reproduce them with the example cases I have. Hopefully those efforts will lower that black box going forward. However, that’s also just half an MB of growth.

The Scripting VM also grew by 5MB and some minor managed objects were retained (mostly some UI property caches and some strings held in memory by your async task). So that’s probably entirely negligible.

If you feel like digging deeper into those allocations you could use a native platform profiler, but I don’t see much of a case for an actual uncontrollable growth of memory usage here. Or at least nothing that won’t taper out to something pretty stable if you were to repeat that cycle say 20-50 times more…
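If you do repeat the cycle that often, eyeballing 20-50 readings gets tedious, so a small check like the one below could decide whether the numbers have tapered out. This is only a sketch: MemoryTrend, HasPlateaued and both parameters are made-up names, and what counts as "stable enough" (the tolerance and the number of trailing cycles) is a judgment call for your project.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: decides whether per-cycle memory readings have levelled off.
public static class MemoryTrend {

	// Returns true when none of the last `window` cycle-to-cycle changes
	// grew by more than `toleranceBytes`. Shrinking readings count as stable.
	public static bool HasPlateaued (IReadOnlyList<long> readings, int window, long toleranceBytes) {
		// Need at least window + 1 readings to form `window` deltas.
		if (window < 1 || readings.Count < window + 1) return false;

		for (int i = readings.Count - window; i < readings.Count; i++) {
			long growth = readings[i] - readings[i - 1];
			if (growth > toleranceBytes) return false;
		}
		return true;
	}
}
```

For example, feeding it one reading per cycle (in bytes) with a window of 5 and a tolerance of 1MB would flag a run as stable once five consecutive cycles each grew by 1MB or less.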
