Compute Buffer Memory Management

Blackmagic919 · February 21, 2024, 6:47am

Background

So I’m creating a procedural terrain generation system in which all the data is continuously fed through compute shaders and passed between compute buffers.

I’ve managed to process everything so far without ever reading back to the CPU until the mesh is generated. I read back the mesh asynchronously at the very end for multiple reasons (collision mesh, memory, etc.).

However, I want to keep some data on the GPU, such as geometry shader data, density maps so shaders can use them, etc. Because I’m doing everything through compute shaders, I cannot know the exact size of the mesh or constituent parts necessary for its generation (structure data, helper maps, etc.), so I’m always wasting memory because I have to create compute buffers at their maximum possible size.

As a result, I’ve created ~2-3 compute buffers totalling 3.5 GB, into which I’m copying all my data. These buffers are managed similarly to a heap in CPU memory and are quite complicated, but they persist for the lifetime of the scene.

So for every chunk, I’m progressively creating temporary compute buffers, copying some of their data to these persistent buffers, and then releasing the temporary buffers. These temporary buffers can take up ~0.75-1 GB at a time before they are released.

Problem

That leads into my problem. I’m experiencing some sort of memory corruption inside these persistent buffers. It is not always apparent, and has only appeared in certain cases: when I create and release medium-sized buffers rapidly across frames, when I have some sort of memory leak, or when a shader compiles with errors.

I believe the problem is memory fragmentation on the GPU. My assumption is that, for example, rapidly creating and releasing medium-sized buffers fragments the memory, and when a new temporary buffer cannot find an empty space, it overwrites memory belonging to the persistent buffers.

In another case, I’m using a persistent compute buffer to bake the SunOpticalDepth used in calculating inScattering, but there are frames where it produces gibberish while the terrain generates. The compute buffer is never released or re-created, so this is really mysterious. One possible reason is that the GPU is trying to rearrange memory because it can’t find space for new data, but I’m not sure.

Could something like this happen? How are compute buffers on the GPU physically being sorted? Should I ask an OpenGL thread?

Blackmagic919 · February 21, 2024, 7:27am

If resource allocation is the problem, would a plausible solution be creating a fixed set of compute buffers when the scene loads (say ~4-5 GB in total), and never allocating any new buffers at runtime? If I take care of memory management myself, the GPU memory can’t fragment, right?

Edit: Setting ComputeBufferMode.Immutable seems to help. Maybe ComputeBufferMode.Immutable helps preserve data integrity?

Invertex · February 25, 2024, 5:17pm

Yes, this is common, especially in large-world games: you keep static “buckets” that you manipulate data in, which avoids frequently changing memory mappings and free-space recycling. You manage the sizes and positions of the data in your buckets yourself. This can also help you ensure a more linear layout of data that will be processed together, so the GPU can make more optimization assumptions (when handled correctly; if you have lots of compute-time derived lookups, this can go the opposite way).
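As a rough illustration of managing positions inside a bucket yourself, here is a minimal CPU-side sketch of a first-fit free-list allocator that hands out element offsets within one fixed-capacity buffer. It is written in Python purely for illustration and is not Unity API; `BucketAllocator` and its methods are hypothetical names, and the thread does not describe which allocation strategy is actually used.

```python
class BucketAllocator:
    """First-fit free-list allocator over a fixed-capacity bucket.

    Offsets and lengths are in elements. Hypothetical helper, not Unity API.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.free = [(0, capacity)]  # sorted list of (offset, length) free ranges
        self.used = {}               # offset -> length of live allocations

    def alloc(self, count):
        """Return the offset of a free range of `count` elements, or None if nothing fits."""
        if count <= 0:
            raise ValueError("count must be positive")
        for i, (offset, length) in enumerate(self.free):
            if length >= count:
                if length == count:
                    del self.free[i]
                else:
                    # Shrink the free range from the front.
                    self.free[i] = (offset + count, length - count)
                self.used[offset] = count
                return offset
        return None

    def release(self, offset):
        """Return a range to the free list and merge it with adjacent free ranges."""
        length = self.used.pop(offset)
        self.free.append((offset, length))
        self.free.sort()
        merged = [self.free[0]]
        for off, ln in self.free[1:]:
            last_off, last_ln = merged[-1]
            if last_off + last_ln == off:
                merged[-1] = (last_off, last_ln + ln)
            else:
                merged.append((off, ln))
        self.free = merged


bucket = BucketAllocator(capacity=1024)
a = bucket.alloc(256)   # first chunk
b = bucket.alloc(512)   # second chunk
bucket.release(a)       # first chunk unloaded
c = bucket.alloc(128)   # new chunk reuses the freed range
print(a, b, c)
```

Because the bucket itself is allocated once, any fragmentation that occurs stays inside it and is visible to (and fixable by) your own bookkeeping, for example by merging neighbouring free ranges as `release` does here.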

As for immutable, yes, that’s true on the CPU as well. The compiler can make more assumptions about how the data might possibly get used, so knowing that it’s immutable means it doesn’t need to worry about write pathways or interlocking, among other things.

Blackmagic919 · March 3, 2024, 11:24pm

Thanks for the advice! So I kind of figured out half the issue: I was reading data outside of my buffer (like I always do) and fixed that.

I do have static buckets where I offload data after it is generated into a temporary buffer. The general process is like this.

I generate some data and put it in append buffers; this data is things like structure information, marched triangles, geo shader geometry, etc. Then I copy each buffer’s count to another compute buffer and use that count to find a place in a large structured buffer “bucket”, into which I copy all the data. Then I can release these temporary append buffers.

I was just wondering: should I keep one big append buffer as well and reset its count whenever I use it again, so that I never create a new buffer at runtime? Would that help performance by reducing the time spent on GC.Alloc()?
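The copy step described above, together with the out-of-range access mentioned earlier, can be sketched on the CPU as follows. This is a Python stand-in under stated assumptions: the lists play the role of the append buffer and the bucket, `copy_to_bucket` is a hypothetical helper, and `count` stands for the counter value copied out of the append buffer (in Unity this is typically obtained with `ComputeBuffer.CopyCount`). A shader will generally not raise an error on an out-of-range access, so the range check has to be explicit.

```python
def copy_to_bucket(bucket, offset, temp, count):
    """Copy the first `count` items of `temp` into `bucket` starting at `offset`.

    Raises IndexError instead of silently touching memory outside either buffer.
    """
    if count < 0 or count > len(temp):
        raise IndexError("count exceeds the temporary buffer")
    if offset < 0 or offset + count > len(bucket):
        raise IndexError("write would run past the end of the bucket")
    bucket[offset:offset + count] = temp[:count]


bucket = [0] * 16            # stand-in for the persistent structured buffer
temp = [7, 8, 9, 0, 0, 0]    # stand-in for an append buffer with capacity 6
count = 3                    # stand-in for the append buffer's counter value

copy_to_bucket(bucket, offset=4, temp=temp, count=count)
print(bucket[4:7])
```

In a compute shader the equivalent guard would be an early return when the thread index is at or beyond the count, plus a check that `offset + count` fits within the bucket before the copy is dispatched.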

Invertex · March 4, 2024, 8:01pm

Yeah that’s a main use of the append buffer. You usually shouldn’t need to be creating them constantly but just clearing their count value. Allocate enough capacity headroom for your expected needs.
Using memory space doesn’t have any real impact on performance, it’s when you run out of space that it becomes a problem. So if you can hold on to that append buffer for reuse instead of deallocating to free memory, absolutely do it.
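A minimal sketch of that reuse pattern, again as a CPU-side Python stand-in rather than Unity code: the storage is allocated once with headroom, and only the counter is reset between uses. `AppendBuffer` is a hypothetical class; in Unity the reset corresponds to calling `ComputeBuffer.SetCounterValue(0)` on the existing buffer.

```python
class AppendBuffer:
    """Fixed-capacity append buffer whose storage is allocated once and reused."""

    def __init__(self, capacity):
        self.data = [None] * capacity  # allocated once, never resized
        self.count = 0                 # number of valid items

    def append(self, item):
        """Append an item; return False (and drop it) if the buffer is full."""
        if self.count >= len(self.data):
            return False
        self.data[self.count] = item
        self.count += 1
        return True

    def reset(self):
        """Clear the buffer by resetting the counter; old contents are simply overwritten."""
        self.count = 0


buf = AppendBuffer(capacity=4)
storage_before = id(buf.data)

for chunk in (["a", "b", "c"], ["d"]):
    buf.reset()                        # reuse instead of releasing and re-creating
    for item in chunk:
        buf.append(item)
    print(buf.count, buf.data[:buf.count])

print(id(buf.data) == storage_before)  # same storage throughout
```

Only the first `count` entries are meaningful after a reset; stale items beyond that index remain in the storage, so anything reading the buffer has to respect the counter.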
