Managing memory when doing operations on tensors

I’m executing a model about once every frame, which works well so far, but I want to do some processing on the output tensor on-device.
I’m using tensor operations with code that looks like this:


Somewhere in my Start() method:

_allocator = new TensorCachingAllocator();
_ops = WorkerFactory.CreateOps(Backend, _allocator);

And later (each time after running worker.Execute()):

TensorFloat minT = _ops.ReduceMin(depth as TensorFloat, null, false);
TensorFloat maxT = _ops.ReduceMax(depth as TensorFloat, null, false);
TensorFloat a = _ops.Sub(depth as TensorFloat, minT);
TensorFloat b = _ops.Sub(maxT, minT);
Tensor normalized = _ops.Div(a, b);
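
For reference, the five calls above amount to a min-max normalization, (depth - min) / (max - min), applied to every element. Below is a minimal CPU-side sketch of the same arithmetic in plain C#, independent of Sentis, which can be handy for checking the on-device result. MinMaxExample and Normalize are hypothetical names, and returning all zeros when max equals min is an assumption of this sketch, not something the Sentis ops are known to do.

```csharp
using System;

public static class MinMaxExample
{
    // Hypothetical helper: rescales values into [0, 1] using (v - min) / (max - min).
    public static float[] Normalize(float[] values)
    {
        if (values == null || values.Length == 0)
            return Array.Empty<float>();

        // Single pass to find the smallest and largest element.
        float min = values[0];
        float max = values[0];
        foreach (float v in values)
        {
            if (v < min) min = v;
            if (v > max) max = v;
        }

        float range = max - min;
        var result = new float[values.Length];

        // Assumption: a constant input maps to all zeros instead of dividing by zero.
        if (range == 0f)
            return result;

        for (int i = 0; i < values.Length; i++)
            result[i] = (values[i] - min) / range;

        return result;
    }

    public static void Main()
    {
        float[] depth = { 2f, 4f, 6f };
        Console.WriteLine(string.Join(" ", Normalize(depth)));
    }
}
```

The smallest input always maps to 0 and the largest to 1, so comparing a few elements against the tensor read back from the device is a quick sanity check.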

However, this leaks memory: ‘Graphics’ memory in the Memory Profiler increases steadily. How am I meant to manage memory when using tensor operations? I tried disposing of the intermediate tensors, but the documentation doesn’t say I need to, and it didn’t fix the issue either way.

I’ve read this other thread, which talks about a different way to do custom calculations, but I don’t see it anywhere in the official documentation. So I’m a bit confused about the best practice here.

Hope somebody can help. Thanks!

See also the sample Samples\Do an operation on a tensor.
Remember to dispose of the ops and the allocator :)


Thanks for the quick answer!

Just to make sure I understand correctly: Ops are not meant to be reused and I’m supposed to dispose them and recreate them each time I run the model again?

Doing that works in that it doesn’t leak any memory. But it seems a bit counterintuitive to me: I need to do the same processing every time, so my intention was to reuse that memory, similar to how I would reuse the same intermediate RenderTexture for post-processing each frame and only dispose of it in OnDestroy().

If that’s not the intended design, that’s fine by me, but then the docs should probably mention this on the Managing memory page. In some samples like this one, the ops are not being disposed either.

In terms of efficiency: is it fine to assume the memory needed for these Ops gets cached between runs anyway, instead of being constantly allocated and deallocated?
The way I understand this part of the Managing memory docs:

You don’t need to call Dispose for […] Ops objects you create with no allocator

is that calling WorkerFactory.CreateOps() behaves differently in terms of caching depending on whether you pass null or a new TensorCachingAllocator, but I’m not sure I understand when to use which.
And neither option seems to give me the described behaviour of not needing to call Dispose.

I feel like I might be misunderstanding something fundamental, but the docs don’t help much either.

If you create an op (without allocator) you can use it on its own and it will reuse the memory.
But you need to dispose of it at the end of your run (OnDestroy).
Internally we use the allocator to keep track of allocations and re-use tensors when they are not needed/disposed/out of scope.
We offer the ability to construct an op with an allocator if you want to share allocs with other processes…

Follow the sample; it gives a good overview.
Create an op, use it, and dispose of it in OnDestroy.

That is how I was doing it initially, but only disposing in OnDestroy() gives me the memory leak I mentioned. (Keep in mind I’m running Worker.Execute() and the ops on its output tensor every frame, not just once.)

Maybe you can take a look at the code I am using, specifically the WebcamSample of this package: GitHub - Doji-Technologies/com.doji.midas

(Memory increases continually in the profiler under either the ‘Graphics’ or the ‘Native’ section. Note that, contrary to what I said earlier, there still seems to be a memory leak somewhere even when disposing of everything each frame.)

private Tensor Normalize(Tensor depth) {
   _ops = WorkerFactory.CreateOps(Backend, _allocator);

You are creating a new ops object every frame and only disposing of one at the end.
That causes the intermediate ops objects to never be disposed, and thus to leak memory.
The solution is to create the ops only once, similar to the way you are creating a worker.

I am also disposing of the ops in line 109. But I’m only doing it that way because that’s how I understood your initial reply.

Initially (in the commit before), I was creating the ops only once in Start, but that causes memory usage to increase continually until it runs out of graphics memory and crashes, which was the point of this post. (You can check the memory profiler if you want to see for yourself.)

@julienkay actually you are correct, it seems that our ops don’t flush temp allocs.
I’ll dig in.
Thanks for spotting it.


cool, thanks for taking the time to look into it!

@julienkay we have identified the issue and will fix it.
In the meantime, you can add this to your model to do the normalization:

        // Needs: using System.Collections.Generic; (for List<string>)
        var model = ModelLoader.Load(estimationModel);
        var output = model.outputs[0];
        // Append min/max reductions over the original model output.
        model.layers.Add(new Unity.Sentis.Layers.ReduceMax("max0", new[] { output }, false));
        model.layers.Add(new Unity.Sentis.Layers.ReduceMin("min0", new[] { output }, false));
        // range = max - min
        model.layers.Add(new Unity.Sentis.Layers.Sub("max0 - min0", "max0", "min0"));
        // shifted = output - min
        model.layers.Add(new Unity.Sentis.Layers.Sub("output - min0", output, "min0"));
        // normalized = shifted / range
        model.layers.Add(new Unity.Sentis.Layers.Div("output2", "output - min0", "max0 - min0"));
        modelLayerCount = model.layers.Count;
        // Expose both the raw and the normalized output.
        model.outputs = new List<string>() { output, "output2" };
        m_engineEstimation = WorkerFactory.CreateWorker(BackendType.GPUCompute, model);

This issue is known internally as Task 189


Sweet, thanks for posting the workaround!
Looking forward to a proper fix as well.

Can’t edit this anymore, but the repo moved here in case anyone needs it:
