I’m trying to move an array of constant structs from a StructuredBuffer to a cbuffer in a compute shader. I was having issues and isolated them to a simple test case:
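#pragma kernel CSMain

// Output texture from Unity's default compute shader template;
// the kernel writes to it, so it has to be declared.
RWTexture2D<float4> Result;

struct testStruct
{
    float4 testVal1;
};

// Struct-typed member inside a constant buffer: this is what triggers the error.
cbuffer Preferences_Buffer
{
    testStruct testPref;
};

[numthreads(8,8,1)]
void CSMain (uint3 id : SV_DispatchThreadID)
{
    // TODO: insert actual code here!
    Result[id.xy] = testPref.testVal1;
}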
But I receive the error:
“Unknown parameter type (0) for testPref at kernel CSMain”
Despite all my google-fu, I can’t find any similar errors online.
I can find code online of people using structs in cbuffers:
https://gamedev.stackexchange.com/questions/71628/how-do-i-define-array-in-shaders-constant-buffer-with-c
so I’m not sure what I’m doing wrong. I can also just throw a float4 in the cbuffer and can access it fine, so the syntax of that is correct.
Can provide any other info requested, just kind of at a loss for what I could be doing wrong.
Bump, also experiencing this issue with compute shaders.
I never got it to work, just switched back to structured buffer till I have more time to work on potential perf improvements.
Anyone solved this issue?
This is not supported in Unity at the moment.
Why do you want to use constant buffers specifically?
@funkyCoty and why exactly do you need a struct there?
Can’t you do the same with a set of arrays?
But it’s significantly easier to use with a struct, both on the C# side and the HLSL side, plus easier for the GPU to read since data is packed closely together. I think those are quite important features.
Different graphics APIs have different packing rules for structs. The data isn’t necessarily packed tightly.
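To make the packing point concrete, here is a small sketch of how Direct3D's HLSL constant buffer packing rules lay out a few members. The names (`PackingExample`, `Out`, `a` through `e`) are made up for illustration, and the offsets in the comments describe the Direct3D rule only; other graphics APIs can differ in the details.

```hlsl
#pragma kernel CSMain

// Hypothetical output buffer, only here so the kernel has something to write.
RWStructuredBuffer<float4> Out;

cbuffer PackingExample
{
    float3 a;    // register c0.xyz
    float  b;    // c0.w   (fits into the leftover component)
    float2 c;    // c1.xy
    float3 d;    // c2.xyz (cannot straddle a 16-byte boundary, so it starts a new register)
    float  e[2]; // c3.x and c4.x (every array element starts on its own 16-byte register)
};

[numthreads(1,1,1)]
void CSMain (uint3 id : SV_DispatchThreadID)
{
    // Touch every member so nothing is stripped as unused.
    Out[0] = float4(a, b);
    Out[1] = float4(c, e[0], e[1]);
    Out[2] = float4(d, 0.0);
}
```

Note how the two-element float array costs 32 bytes rather than 8, which is one reason a float4 array is usually the densest thing to put in a constant buffer.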
Also, if you are accessing just a part of a struct in some shader stage, individual arrays are going to be faster to access, as it can fit more data into the cache.
Wouldn’t there be more jumping between distant memory locations with parallel arrays if you have a lot of data? One array dereference is a lot cheaper than jumping between potentially a dozen arrays. Processing that much data for a fragment/vertex shader doesn’t make sense, but for an async compute shader, the amount of data per struct can start to add up.
For an example use case, I wanted to use a constant set of preferences for generating particles. The preference has min/max lifetime, texture index, additional color tint, velocity multiplier, rotation multiplier, etc etc. So, for example, given a shade of red it would index different preferences from this constant array. Packing all this data into a struct would be ideal.
That said, I understand your reasoning. It’s disappointing that I can’t do this in Unity, but since mine is a pretty specific edge case, not breaking compatibility with other graphics APIs makes sense.
Thanks for the reply and insight!
Constant buffers are limited to 64KB, so there’s not much jumping around to begin with 
For this specific use case with several arrays, the cost doesn’t come from the dereferencing itself, but from the need to access the memory: if you have to load more data than you need, cache space is wasted and cache misses occur more frequently. It’s definitely easier to achieve close to 100% useful cache occupancy with structs if they are packed tightly, but it’s not impossible with arrays as well, just harder. You could, for example, pack everything into a float4 array (for example, min and max lifetime, velocity and rotation multipliers would go into a single float4) to reduce cache misses.
This all depends on the size of a single cache line. If it’s 128 bits (or lower, which doesn’t sound likely), you will have perfect cache occupancy with this simple setup. If it’s 256 bits, you could pack your data into two sequential float4s in this array, and so on. Such a setup is harder to work with, but emulates structs quite well 
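As a rough illustration of that setup, here is a minimal sketch that emulates a struct with two sequential float4s per entry. Everything named here (`_PrefData`, `MAX_PREFS`, `SLOTS_PER_PREF`, `ParticlePref`, `LoadPref`) and the field layout are assumptions made up for the example, not something from Unity or from the posts above. The struct exists only as a local view inside the shader; the constant buffer itself holds nothing but a float4 array.

```hlsl
#pragma kernel CSMain

// Hypothetical sizes: 64 entries * 2 slots * 16 bytes = 2 KB, well under the 64KB limit.
#define MAX_PREFS 64
#define SLOTS_PER_PREF 2

RWTexture2D<float4> Result;

cbuffer Preferences_Buffer
{
    // Entry i occupies _PrefData[2 * i] and _PrefData[2 * i + 1].
    float4 _PrefData[MAX_PREFS * SLOTS_PER_PREF];
};

// Local view of one packed entry; never declared as a shader parameter.
struct ParticlePref
{
    float  minLifetime;
    float  maxLifetime;
    float  velocityMul;
    float  rotationMul;
    float3 tint;
    float  textureIndex;
};

// Unpacks the two float4 slots that belong to one entry.
ParticlePref LoadPref(uint index)
{
    float4 a = _PrefData[index * SLOTS_PER_PREF + 0];
    float4 b = _PrefData[index * SLOTS_PER_PREF + 1];

    ParticlePref p;
    p.minLifetime  = a.x;
    p.maxLifetime  = a.y;
    p.velocityMul  = a.z;
    p.rotationMul  = a.w;
    p.tint         = b.xyz;
    p.textureIndex = b.w;
    return p;
}

[numthreads(8,8,1)]
void CSMain (uint3 id : SV_DispatchThreadID)
{
    // Placeholder indexing; a real shader would pick the entry from its own data.
    ParticlePref p = LoadPref(id.x % MAX_PREFS);
    Result[id.xy] = float4(p.tint, 1.0);
}
```

On the C# side the array would presumably be filled as a flat Vector4[] in the same order, for example through `ComputeShader.SetVectorArray`; whether that lands in this cbuffer as expected is worth checking in your Unity version.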
Oh! I didn’t even think of putting it all in one big float4 array, but that makes sense! Well I’ll keep that in my back pocket next time I need to scrape for a little more perf. If I end up giving it a try I’ll post some profiler comparisons here.
Basically what has already been discussed here. I’m trying to squeeze out a bit more performance, and it looks like a constant buffer is a good candidate. I’m already using packed arrays. I need to access all of the data anyway, so a constant buffer seems to be the better choice. It’s a lot of data (more than a float4) per struct, but there are not that many structs in my case. (It’s for some GPU particle sim stuff.)
If you’re already using a constant buffer with packed arrays, it sounds like a matter of convenience rather than a performance question.