Compute shader: how to swap buffers at the end of a compute pass

Let’s say I have result data that the kernel “compute” evaluates from buffer data. At the end of this evaluation I want all of the result data copied back into buffer, so the next execution starts from it. How do I do that?

I currently have two kernels, “compute” and “swap”, and after I dispatch compute I dispatch swap. It isn’t working, maybe because compute buffers are local to one kernel.
I read somewhere that you do this swap thing by setting up registry, then my brain melted.

Is there a way to have the compute shader run that buffer swap itself once all the threads have finished? Or do I need to dispatch the kernels sequentially from the C# code?

Note: the end result will be copied over to an array in C# land.

Compute buffers are not local to one kernel. If you call these:

shader.SetBuffer(kernel1, "buffer", buffer);
shader.SetBuffer(kernel2, "buffer", buffer);

then each kernel will have access to “buffer”.

I would do the swap this way: on the C# side, keep a variable that holds the buffer index. At the start of each Update() it flips to the other index, then shader.SetInt() is called to pass this buffer index to the shader side.

And then you run your “compute” shader, that writes its results to one of the buffers, depending on the buffer index.

And then you read one of the buffers depending on buffer’s index variable. So it would be something like this:

bufferIndex = (bufferIndex == 0) ? 1 : 0;
shader.SetInt("bufferIndex", bufferIndex);
shader.Dispatch(computeKernel, ...);
(bufferIndex == 0 ? buffer1 : buffer2).GetData(...);

and on HLSL side:

compute (...){
    if (bufferIndex == 0)
        buffer1[id.x] = ...;
    else
        buffer2[id.x] = ...;
}

Though I’ve heard that conditions like this don’t perform well in shaders, so you might consider having two versions of the compute kernel, each writing to its own buffer, and dispatching one of them depending on bufferIndex.
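A third option, sketched here under assumptions: keep a single kernel with no branch and swap which ComputeBuffer is bound to which name from C#. The kernel name "Compute", the shader-side names "source" and "result", the element count and the group size of 64 are all hypothetical; only FindKernel, SetBuffer, Dispatch, SetData, GetData and Release are actual Unity calls.

```csharp
using UnityEngine;

// Sketch: ping-pong two buffers by rebinding them every frame.
// Assumes a compute shader with a kernel "Compute" that reads
// StructuredBuffer<float> source and writes RWStructuredBuffer<float> result,
// declared with [numthreads(64,1,1)]. These names are hypothetical.
public class PingPongExample : MonoBehaviour
{
    public ComputeShader shader;

    const int Count = 1024;          // hypothetical element count
    const int ThreadsPerGroup = 64;  // must match numthreads in the shader

    ComputeBuffer[] buffers = new ComputeBuffer[2];
    float[] cpuCopy = new float[Count];
    int readIndex = 0;
    int kernel;

    void Start()
    {
        kernel = shader.FindKernel("Compute");
        buffers[0] = new ComputeBuffer(Count, sizeof(float));
        buffers[1] = new ComputeBuffer(Count, sizeof(float));
        buffers[0].SetData(cpuCopy); // initial state: all zeros
    }

    void Update()
    {
        int writeIndex = 1 - readIndex;

        // Last frame's output becomes this frame's input.
        shader.SetBuffer(kernel, "source", buffers[readIndex]);
        shader.SetBuffer(kernel, "result", buffers[writeIndex]);
        shader.Dispatch(kernel, Count / ThreadsPerGroup, 1, 1);

        // Blocking readback of the buffer that was just written.
        buffers[writeIndex].GetData(cpuCopy);

        readIndex = writeIndex;
    }

    void OnDestroy()
    {
        buffers[0].Release();
        buffers[1].Release();
    }
}
```

With this layout nothing is copied on the GPU; the "swap" is only a change of bindings, and the shader never needs to know which physical buffer it is writing to.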

Thanks Z, I didn’t know that SetBuffer of the same buffer needed to be done twice, once for each kernel.

What I needed is in fact a copy at the end of the processing:

void Compute(...)
{
   result[id.x] = buffer[id.x]*...;
}
void Copy(...)
{
   buffer[id.x] = result[id.x];
}

When the compute shader has two kernels, though, I can’t access either of them. Weird, huh? So I gave up for now and moved on to some job stuff.
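For reference, a minimal sketch of how the two-kernel version could be driven from C#. It assumes the shader file declares both kernels, each with its own `#pragma kernel` line (`#pragma kernel Compute` and `#pragma kernel Copy`), and that both use [numthreads(64,1,1)]. The element count and group size are hypothetical; the buffer names "buffer" and "result" follow the snippet above.

```csharp
using UnityEngine;

// Sketch: run Compute, then Copy, every frame.
// Each kernel gets its own SetBuffer calls, as described earlier in the thread.
public class ComputeThenCopy : MonoBehaviour
{
    public ComputeShader shader;

    const int Count = 1024;          // hypothetical element count
    const int ThreadsPerGroup = 64;  // must match numthreads in the shader

    ComputeBuffer buffer;
    ComputeBuffer result;
    int computeKernel;
    int copyKernel;

    void Start()
    {
        computeKernel = shader.FindKernel("Compute");
        copyKernel = shader.FindKernel("Copy");

        buffer = new ComputeBuffer(Count, sizeof(float));
        result = new ComputeBuffer(Count, sizeof(float));

        // Bind both buffers to both kernels.
        shader.SetBuffer(computeKernel, "buffer", buffer);
        shader.SetBuffer(computeKernel, "result", result);
        shader.SetBuffer(copyKernel, "buffer", buffer);
        shader.SetBuffer(copyKernel, "result", result);
    }

    void Update()
    {
        int groups = Count / ThreadsPerGroup;
        shader.Dispatch(computeKernel, groups, 1, 1); // result = f(buffer)
        shader.Dispatch(copyKernel, groups, 1, 1);    // buffer = result
    }

    void OnDestroy()
    {
        buffer.Release();
        result.Release();
    }
}
```

As far as I know, the two dispatches are issued in order, so Copy sees the values Compute wrote. A kernel cannot wait for every thread of the whole dispatch to finish from inside the shader, which is why the second step is a separate dispatch.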

Note: for your swap, how about merging buffer1 and buffer2 into one mega buffer that’s twice the size and indexing it with [id.x + bufferIndex*halfLength]?

Good idea.

Btw, do you need the double buffer in order to use the async version of GetData()?

No, it’s because I want to keep blurring this field of data as time passes; that’s how I do diffusion.
And it’s not a swap, it’s a copy to buffer.

I copy to buffer to avoid a race condition, but maybe GPUs don’t have that. I’ll try result[index] = (emitter[index] + result[index+1] + result[index-1]) / 3 and see if it causes artifacts.

So I tested with this simple shader and I don’t see any artifacts, so I’ll just do that. It seems the GPU stores outputs in a buffer, or maybe the blur hides the artifacts :) Good enough for my purposes.

#pragma kernel CSMain

Texture2D<float4> init;
RWTexture2D<float4> tex;
RWStructuredBuffer<float> data;

[numthreads(8,8,1)]
void CSMain (uint2 id : SV_DispatchThreadID)
{
    float w,h;
    tex.GetDimensions(w,h);
    tex[id] = (init[id]
        + tex[uint2(id.x,id.y+1)]
        + tex[uint2(id.x,id.y-1)]
        + tex[uint2(id.x+1,id.y)]
        + tex[uint2(id.x-1,id.y)]       
        )/5;
}
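On whether the in-place version is safe: as far as I know, nothing in a shader like the one above guarantees that a thread reads its neighbours before another thread overwrites them, so the result can depend on execution order even when the blur hides it. A CPU-only sketch in plain C# (1D, clamped edges, all names hypothetical) shows how an in-place update differs from writing to a separate array. The sequential loop is just one possible ordering; on the GPU the ordering is not specified.

```csharp
using System;

static class InPlaceVsDoubleBuffered
{
    // Three-tap average with clamped edges, reading src and writing dst.
    static void Blur(float[] src, float[] dst)
    {
        int n = src.Length;
        for (int i = 0; i < n; i++)
        {
            float left = src[Math.Max(i - 1, 0)];
            float right = src[Math.Min(i + 1, n - 1)];
            dst[i] = (left + src[i] + right) / 3f;
        }
    }

    static void Main()
    {
        float[] start = { 0f, 0f, 3f, 0f, 0f };

        // Double-buffered: every cell reads only the old values.
        float[] separate = new float[start.Length];
        Blur(start, separate);

        // In place: cells processed later see neighbours that were already updated.
        float[] inPlace = (float[])start.Clone();
        Blur(inPlace, inPlace);

        Console.WriteLine("separate: " + string.Join(", ", separate));
        Console.WriteLine("in place: " + string.Join(", ", inPlace));
    }
}
```

The double-buffered result stays symmetric around the spike, while the in-place result is skewed in the direction the loop runs. For a diffusion that only needs to look plausible this may not matter, but it is the reason for the separate result buffer.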