Hi,
I am currently trying to write a ComputeShader for raytracing octrees, which uses some kind of per-core buffer.
The buffer is only used to keep track of some data during this process.
uint index;
BufferData buffer[16];
The compute shader writes to the buffer at the current index and then advances the index. Reads work the same way, except that the index is decremented first. In theory it should behave like a stack.
This is what a simple write/read looks like:
castStack[depth++] = data;
data = castStack[--depth];
But this gives me an error:
Compilation failed for kernel 'CSMain' [0x80004005 - unknown error] 'internal error: compilation aborted unexpectedly
' at kernel CSMain
If any more information is required, or if I should post the whole program I can do that as well.
I tried to figure out where the problem lies (maybe it's a GPU limitation that prevents dynamically indexing the array), but Google won't tell me a thing, or at least I don't know what to look for, so I figured I would try my luck here.
Hi!
Please submit a bug report, we’ll take a look at what’s happening.
Dynamically indexing this kind of array can be problematic in shaders.
In C++ and C# an array is ultimately a pointer to some memory, and indexing is just adding an offset to that pointer. But shaders don’t have that: all mutable variables are stored in registers. So the compiler actually implements such arrays as multiple variables, and any dynamic indexing is compiled into a bunch of conditional selectors for reads, and branches for writes. This can get very hairy very fast if you’re writing to a dynamic index inside a dynamic loop with your own dynamic branches on top. While the shader compiler shouldn’t crash, it’s not all that surprising it did.
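To make the description above a bit more concrete, here is a rough, hand-written illustration of what dynamic indexing can turn into when a 4-entry array is treated as four separate variables. This is only a conceptual sketch, not actual compiler output; the function names and the array size are made up for the example, and the real lowering depends on the compiler and the hardware.

```hlsl
// Conceptual sketch only: a 4-entry "array" held as four separate variables.
// ReadDynamic and WriteDynamic are hypothetical names used for illustration.

// A dynamic read becomes a chain of conditional selects.
float ReadDynamic(float s0, float s1, float s2, float s3, uint i)
{
    return (i == 0) ? s0
         : (i == 1) ? s1
         : (i == 2) ? s2
         :            s3;
}

// A dynamic write becomes a set of branches, one per possible slot.
void WriteDynamic(inout float s0, inout float s1, inout float s2, inout float s3,
                  uint i, float value)
{
    if      (i == 0) s0 = value;
    else if (i == 1) s1 = value;
    else if (i == 2) s2 = value;
    else             s3 = value;
}
```

With 16 entries, and with the reads and writes sitting inside a dynamic loop, the amount of generated code grows quickly, which is the situation described above.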
You need to either rethink your algorithm to avoid writing to such arrays, or use a groupshared buffer for that. Groupshared buffers are a special feature only available to compute shaders: dedicated fast memory that is shared by a compute shader thread group. All threads in a group can read and write to it, and see the data written by the other threads, which allows compute shaders to do things pixel shaders cannot.
Just be aware that when using groupshared memory you need to use barriers to synchronize access to data written by a different thread, since the writing order is undefined and it’s not instantaneous.
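As a minimal sketch of that synchronization, assuming a group of 64 threads and a hypothetical `Output` buffer (neither is from the original shader): each thread writes one value into groupshared memory, waits at a barrier, and only then reads the value written by a neighbouring thread.

```hlsl
#pragma kernel BarrierExample

#define GROUP_SIZE 64                   // assumed thread group size

groupshared float gShared[GROUP_SIZE];  // one slot per thread in the group
RWStructuredBuffer<float> Output;       // hypothetical output buffer

[numthreads(GROUP_SIZE, 1, 1)]
void BarrierExample(uint3 id : SV_DispatchThreadID, uint groupIndex : SV_GroupIndex)
{
    // Each thread writes its own slot.
    gShared[groupIndex] = (float)id.x;

    // Wait until every thread in the group has reached this point
    // and all groupshared writes are visible to the group.
    GroupMemoryBarrierWithGroupSync();

    // Now it is safe to read a slot written by another thread.
    uint neighbour = (groupIndex + 1) % GROUP_SIZE;
    Output[id.x] = gShared[neighbour];
}
```

Without the barrier, the read of `gShared[neighbour]` could happen before the neighbouring thread has written its slot.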
In your case, if the stack is per thread, you could create a groupshared array whose length is (number of threads) * (stack size), then use the thread index as an offset into it so that each thread has its own part of the array to use as its private stack.
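A minimal sketch of that layout could look like the following. It assumes 64 threads per group and a 16-entry stack (matching the 16-element buffer in the question); the `BufferData` fields and the `Result` buffer are placeholders, since the real ones are not shown in the thread.

```hlsl
#pragma kernel CSMain

#define THREADS_PER_GROUP 64   // assumed group size
#define STACK_SIZE        16   // matches the 16-entry buffer in the question

// Hypothetical payload; the real BufferData layout is not shown in the thread.
struct BufferData
{
    uint  node;
    float tMax;
};

// One contiguous slice of STACK_SIZE entries per thread in the group.
groupshared BufferData gStacks[THREADS_PER_GROUP * STACK_SIZE];

RWStructuredBuffer<uint> Result;   // hypothetical output buffer

[numthreads(THREADS_PER_GROUP, 1, 1)]
void CSMain(uint3 id : SV_DispatchThreadID, uint groupIndex : SV_GroupIndex)
{
    uint base  = groupIndex * STACK_SIZE;  // start of this thread's slice
    uint depth = 0;                        // number of entries on this thread's stack

    BufferData data;
    data.node = id.x;
    data.tMax = 1.0;

    // Push: write at the current depth, then advance it.
    if (depth < STACK_SIZE)
        gStacks[base + depth++] = data;

    // Pop: step back first, then read.
    if (depth > 0)
        data = gStacks[base + --depth];

    Result[id.x] = data.node;
}
```

Because each thread only ever touches its own slice, no barrier is needed in this particular sketch. Keep the total size in mind: with these assumed numbers the array takes 64 * 16 * 8 bytes = 8 KB, and groupshared memory is limited (32 KB per group on Direct3D 11 class hardware), so a larger `BufferData` or group size may not fit.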
So it turned out that it magically fixed itself. I was trying to get it to work by creating a new shader and pasting in the code line by line to figure out which line was the problem. But it turns out that doing this fixed the problem, and now it compiles perfectly.
I would have loved to submit a bug report, because this issue cost me around 2 days, which was especially frustrating when it turned out that it fixed itself by adding another compute shader…
I really don't know what happened there…
Thanks anyway for the help.