Hello, thank you for the response.
While a lot of devices support SSBOs, that doesn’t mean they’re the best option for a given platform, and even less that they’re the best solution for a specific case (again, changing how this works on the engine side would still allow DOTS to be implemented, but would also allow a much broader range of solutions).
While it’s true that Raw and Structured buffers are similar, raw buffers don’t let the driver optimize load alignment, whereas Structured buffers pretty much do.
Lifting the limitation on binding buffer+offset (which is supported on ALL platforms) per fat draw call (visibility data/batch data) would make it possible to implement faster GPU culling without duplicating materials (overhead) and without creating a lot of buffers (even more overhead).
Given that Unity still doesn’t support bindless for some reason, culling multiple buffers requires a lot of dispatches (which also have immense CPU overhead in Unity), while with a buffer+offset scheme it’s totally possible to cull everything in just a couple of dispatches.
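To make the buffer+offset idea a bit more concrete, here is a rough CPU-side sketch in plain C++ rather than a real compute shader. Everything in it is an assumption for illustration: the type and function names (`BatchRange`, `cullAllBatches`, and so on) are made up and are not Unity API, and the sphere-vs-plane test is just a stand-in for whatever culling test is actually used. The point is only that when all batches live in one buffer and differ by offset, a single pass can cull all of them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only; none of this is Unity API.
struct Sphere { float x, y, z, radius; };  // per-instance bounding sphere
struct Plane  { float nx, ny, nz, d; };    // n.p + d >= 0 means "inside"

// A batch is just a window (offset + count) into the one shared buffer.
struct BatchRange { std::uint32_t firstInstance; std::uint32_t instanceCount; };

struct CullResult {
    std::vector<std::uint32_t> visibleIds;     // surviving instance ids, grouped by batch
    std::vector<std::uint32_t> visibleCounts;  // one count per batch (would feed indirect args)
};

inline bool sphereVisible(const Sphere& s, const std::vector<Plane>& frustum) {
    for (const Plane& p : frustum) {
        const float dist = p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d;
        if (dist < -s.radius) return false;  // fully behind this plane
    }
    return true;
}

// Stand-in for one compute dispatch: every batch is handled in the same pass
// because they all share one buffer and are distinguished only by offset.
CullResult cullAllBatches(const std::vector<Sphere>& instances,
                          const std::vector<BatchRange>& batches,
                          const std::vector<Plane>& frustum) {
    CullResult out;
    out.visibleCounts.assign(batches.size(), 0u);
    for (std::size_t b = 0; b < batches.size(); ++b) {
        const BatchRange& r = batches[b];
        assert(static_cast<std::size_t>(r.firstInstance) + r.instanceCount
               <= instances.size());
        for (std::uint32_t i = 0; i < r.instanceCount; ++i) {
            const std::uint32_t id = r.firstInstance + i;
            if (sphereVisible(instances[id], frustum)) {
                out.visibleIds.push_back(id);
                ++out.visibleCounts[b];
            }
        }
    }
    return out;
}
```

On the GPU the outer loop would be threads of one dispatch, each looking up its batch range, instead of one dispatch per buffer.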
The same goes for per-instance (or per-batch) data, especially when adding draw instructions for lights, where multiple fat draw calls are needed because it’s otherwise impossible to specify separate per-split instances.
Also, MDI (multi-draw indirect) is still not supported, even though it’s just a matter of hours to implement with a native plugin. But ofc the native plugin API is limited and can’t access Unity’s compiled PSOs, so doing it that way requires tons of hacks.
I mean, there’s a base instance ID problem on a lot of DX12 devices, but nobody is asking you to support something the current platform doesn’t support, cmon. Having a limited option is much better than having none.
I also don’t understand the limitation on per-fat-draw-call data (the CPU-side one). I can only provide 4 bytes per instance (the visibility ID). Why isn’t it possible to provide custom per-instance data, if the visibility instance data is written from CPU to GPU anyway (16 B per instance is written)? It could be just a nice API that accepts arbitrary data.
Per-fat-draw-call SRV bindings would be OP for the use cases where custom light/reflection probe/light probe/other volumes make sense (I know some people suffering from this, and it takes tons of effort to do with hacks). Same for per-fat-draw-call constant buffer bindings.
The solution I’ve implemented does instancing not on a per-material basis but, unlike GRD, on something closer to per-instance properties, and I use AoS instead of SoA (contrary to DOTS) for better cache coherency and cheaper loads (fewer indirections), since the texture load cache is not infinite.
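For anyone not familiar with the AoS vs SoA distinction, here is a minimal C++ sketch of the two layouts. The property names (`objectToWorld`, `color`, `lightMask`) and sizes are hypothetical and only there to show the shape of the data; this is not my actual layout or Unity’s.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// SoA: one array per property. Loading one instance means one fetch per
// property, each from a different region of memory.
struct InstancesSoA {
    std::vector<float>         objectToWorld;  // 12 floats per instance (3x4 matrix)
    std::vector<float>         color;          // 4 floats per instance
    std::vector<std::uint32_t> lightMask;      // 1 value per instance
};

// AoS: everything for one instance is contiguous, so a single offset
// locates all of its properties.
struct InstanceAoS {
    float         objectToWorld[12];
    float         color[4];
    std::uint32_t lightMask;
    std::uint32_t pad[3];  // padding so the stride is a multiple of 16 bytes
};
static_assert(sizeof(InstanceAoS) == 80, "expected an 80-byte instance stride");

// Byte offset of an instance inside an AoS buffer: one multiply, one base.
constexpr std::size_t aosByteOffset(std::size_t instanceIndex) {
    return instanceIndex * sizeof(InstanceAoS);
}

// Number of separate arrays that must be read to fetch one full instance.
constexpr int soaFetchesPerInstance() { return 3; }  // matrix, color, mask
constexpr int aosFetchesPerInstance() { return 1; }  // one contiguous record
```

With the AoS layout the shader reads one contiguous record per instance, which is the cache behaviour I’m after; SoA only wins when a pass touches a single property across many instances.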
Also, for some reason Unity doesn’t support copying part of a buffer to another part of the buffer, which is supported on every platform existing today that supports buffers.
P.S. I’m sorry if my critique sounds harsh; it’s mostly out of frustration and the knowledge that there are far better options in the form of low-hanging fruit, which wouldn’t take long to implement.
I love the engine and want it to prosper and be used more frequently by higher-grade developers.