Compute Buffer Memory Management

Background

I’m building a procedural terrain generator in which all the data is continuously fed through compute shaders and passed between compute buffers.

So far I’ve managed to process everything without reading back to the CPU until the mesh is generated. I read the mesh back asynchronously at the very end for several reasons (collision mesh, memory, etc.).

However, I want to keep some data on the GPU, such as geometry shader data and density maps, so that shaders can use it. Because I’m doing everything through compute shaders, I cannot know in advance the exact size of the mesh or of the constituent parts needed to generate it (structure data, helper maps, etc.), so I’m always wasting memory because I have to create compute buffers at their maximum possible size.

As a result, I’ve created ~2-3 compute buffers totaling 3.5 GB, into which I copy all my data. These buffers are managed much like a heap in CPU memory and are quite complicated, but they persist for the lifetime of the scene.

So for every chunk, I progressively create temporary compute buffers, copy some of their data into the persistent buffers, and then release the temporary ones. The temporary buffers can take up ~0.75-1 GB at a time before they are released.

Problem

That leads to my problem. I’m experiencing some sort of memory corruption inside these persistent buffers. It is not always apparent and has only appeared in certain cases: for example, when I create and release medium-sized buffers rapidly across frames, when I have some sort of memory leak, or when a shader compiles with errors.

I believe the problem is memory fragmentation on the GPU. My assumption is that rapidly creating and releasing medium-sized buffers fragments the memory, and that when a new temporary buffer cannot find a free block, it overwrites memory belonging to the persistent buffers.

In another case, I’m using a persistent compute buffer to bake the SunOpticalDepth used in calculating inScattering, but on some frames it contains gibberish while the terrain generates. The compute buffer is never released or re-created, so this is really mysterious. One possible reason is that the GPU is rearranging memory because it can’t find space for new data, but I’m not sure.

Could something like this happen? How are compute buffers physically laid out in GPU memory? Should I ask in an OpenGL thread?

If resource allocation is the problem, would a plausible solution be to create a fixed set of compute buffers when the scene loads (say ~4-5 GB in total) and never allocate any new buffers at runtime? If I take care of memory management myself, the GPU memory can’t fragment, right?

Edit: Setting ComputeBufferMode.Immutable seems to help. Maybe ComputeBufferMode.Immutable helps preserve data integrity?

Yes, this is common, especially in large-world games: you keep static “buckets” that you manipulate data in, which avoids frequent changes to memory mapping and the work of recycling free space. You manage the sizes and positions of the data in your buckets yourself. This can also help you ensure a more linear layout for data that will be processed together, so the GPU can make more optimization assumptions (when handled correctly; if you have lots of compute-time derived lookups, this can go the opposite way).

As for immutable, yes, that’s true on the CPU as well. The compiler can make more assumptions about how the data might possibly get used, so knowing that it’s immutable means it doesn’t need to worry about write pathways or interlocking, among other things.
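To make the static-bucket idea above concrete, here is a minimal CPU-side sketch in Python of a first-fit free list that hands out ranges inside one fixed-size bucket. It is only a model under assumptions: the `Bucket` class, its method names, and the element-count units are hypothetical, it does not use any Unity API, and a real implementation would keep this bookkeeping on the CPU or in a small GPU buffer alongside the actual data.

```python
class Bucket:
    """CPU-side model of one fixed-size bucket managed as a free list (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Sorted list of (offset, length) free ranges; initially one big range.
        self.free = [(0, capacity)]

    def alloc(self, size):
        """Return the offset of the first free range that fits, or None."""
        for i, (off, length) in enumerate(self.free):
            if length >= size:
                if length == size:
                    del self.free[i]
                else:
                    # Shrink the free range from the front.
                    self.free[i] = (off + size, length - size)
                return off
        return None

    def release(self, offset, size):
        """Return a range to the free list and merge adjacent free ranges."""
        self.free.append((offset, size))
        self.free.sort()
        merged = [self.free[0]]
        for off, length in self.free[1:]:
            last_off, last_len = merged[-1]
            if last_off + last_len == off:
                merged[-1] = (last_off, last_len + length)
            else:
                merged.append((off, length))
        self.free = merged


bucket = Bucket(1024)
a = bucket.alloc(256)    # range for chunk A
b = bucket.alloc(512)    # range for chunk B
bucket.release(a, 256)   # chunk A unloaded; its range becomes reusable
c = bucket.alloc(128)    # chunk C reuses the start of A's old range
```

Because the bucket itself is never reallocated, any fragmentation stays inside it, where the merge step in `release` can keep it under control.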

Thanks for the advice! I’ve figured out about half of the issue: I was reading data outside of my buffer (like I always do), and I’ve fixed that.

I do have static buckets where I offload data after it has been generated into a temporary buffer. The general process is like this.

I generate some data and put it in append buffers: structure information, marched triangles, geometry shader geometry, etc. Then I copy each append buffer’s count to another compute buffer and use that count to find a place in a large structured buffer “bucket”, into which I copy all the data. Then I can release the temporary append buffers.
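The offload step described above can be modeled roughly as follows. This is a CPU-side sketch in Python, not the actual compute shader code: `offload`, `cursor`, and the plain lists standing in for buffers are all hypothetical. It assumes the simplest placement policy (a running cursor) and adds explicit bounds checks, since reading or writing outside a buffer is exactly the kind of bug that looks like memory corruption.

```python
def offload(temp, count, bucket, cursor):
    """Copy the first `count` items of `temp` into `bucket` at `cursor`.

    Returns (offset, new_cursor). Raises instead of touching anything
    outside either buffer.
    """
    if count < 0 or count > len(temp):
        raise ValueError("count exceeds the temporary buffer")
    if cursor < 0 or cursor + count > len(bucket):
        raise MemoryError("not enough room left in the bucket")
    # Both slices have exactly `count` elements, so the bucket keeps its size.
    bucket[cursor:cursor + count] = temp[:count]
    return cursor, cursor + count


bucket = [0] * 8                      # persistent bucket, allocated once
temp = [7, 8, 9, 0, 0]                # temporary append buffer, capacity 5
offset, cursor = offload(temp, 3, bucket, 0)       # only 3 items are valid
offset2, cursor = offload([4, 5], 2, bucket, cursor)
```

In Unity, the count would presumably come from something like `ComputeBuffer.CopyCount` rather than a Python integer, and the copy itself would run in a compute shader with the same kind of range check.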

I was just wondering: should I also keep one big append buffer and reset its count whenever I use it again, so that I never create a new buffer at runtime? Would that help performance by reducing the time spent on GC.Alloc()?

Yeah, that’s one of the main uses of an append buffer. You usually shouldn’t need to create them constantly; just reset their count value. Allocate enough capacity headroom for your expected needs.
Using memory doesn’t have any real impact on performance in itself; it only becomes a problem when you run out of space. So if you can hold on to that append buffer for reuse instead of deallocating it to free memory, absolutely do it.
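As a rough illustration of reusing one append buffer by resetting its count, here is a small CPU-side model in Python. The `AppendBuffer` class is hypothetical and only mimics the behavior; in Unity the reset would presumably be a call such as `ComputeBuffer.SetCounterValue(0)` on a buffer created once at load time.

```python
class AppendBuffer:
    """Model of an append buffer: fixed storage plus a counter (hypothetical)."""

    def __init__(self, capacity):
        self.data = [None] * capacity   # storage is allocated exactly once
        self.count = 0

    def append(self, item):
        """Store an item; return False if the buffer is already full."""
        if self.count >= len(self.data):
            return False                # over capacity: the item is dropped
        self.data[self.count] = item
        self.count += 1
        return True

    def reset(self):
        """Reuse the buffer: only the counter changes, storage is kept."""
        self.count = 0


buf = AppendBuffer(4)
for tri in ("t0", "t1", "t2"):
    buf.append(tri)          # first chunk writes 3 items
buf.reset()                  # next chunk starts at index 0 again
buf.append("u0")             # stale items past `count` are simply ignored
```

The capacity check in `append` is where the "headroom" advice matters: if the buffer is sized too small, items are lost rather than the buffer growing.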