I’m interested in indirect drawing, i.e. preparing and culling draw calls directly on the GPU. It seems like Graphics.RenderMeshIndirect was made for this, but it has some drawbacks.
The manual says to pass in the SV_DrawId semantic for multidraw support. As far as I can tell, though, this semantic is not supported: the shader compiler simply doesn’t recognise it at all. Surely this must be possible somehow, since the entirety of UnityIndirect.cginc relies on it?
The reason this is needed, along with most of UnityIndirect.cginc, is to offset the InstanceID properly. Without it, the StartInstance property of the IndirectArgs seems to be ignored, but only on DirectX platforms (reported as incident IN-41509). On Vulkan the InstanceId is offset correctly, so none of this is needed there.
I’m not sure how I can get multidraw working in a performant way without this. We could call RenderMeshIndirect once per draw and add a property for the draw ID. I think that’s how it’s supposed to be done in DX11, but we target DX12, which should be capable of doing this out of the box with ExecuteIndirect. That would allow us to batch many more draw calls. Native plugins were a dead end too: I couldn’t figure out how to prepare a managed mesh and material for rendering.
DirectX by fundamental design (as in DX the API, not our backend) only uses startInstance to offset the data you load; the SV_InstanceID parameter itself is not offset. We emulate this elsewhere, going as far as actually subtracting the base instance from the instance ID on Metal to make it look like DirectX. Technically, what is happening on Vulkan on our side is a bug, or alternatively it’s just platform-dependent behaviour. You need to pass something else into the shader to give it that offset instance ID.
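To make the difference concrete, here is a minimal CPU-side sketch in C++ of the behaviour described above. It is only a model, not Unity or driver code: the `IndirectArgs` struct and all function names are hypothetical, and the "Vulkan-style" function simply reflects what the thread reports (the ID arrives with the base instance already added).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical mirror of an indexed indirect argument record (names are illustrative).
struct IndirectArgs {
    std::uint32_t indexCountPerInstance;
    std::uint32_t instanceCount;
    std::uint32_t startIndex;
    std::uint32_t baseVertex;
    std::uint32_t startInstance;
};

// DirectX-style: the instance ID seen by the shader always starts at 0,
// regardless of startInstance.
std::vector<std::uint32_t> directXInstanceIds(const IndirectArgs& args) {
    std::vector<std::uint32_t> ids;
    for (std::uint32_t i = 0; i < args.instanceCount; ++i) {
        ids.push_back(i);
    }
    return ids;
}

// Vulkan-style, as reported in the thread: the ID already includes startInstance.
std::vector<std::uint32_t> vulkanInstanceIds(const IndirectArgs& args) {
    std::vector<std::uint32_t> ids;
    for (std::uint32_t i = 0; i < args.instanceCount; ++i) {
        ids.push_back(args.startInstance + i);
    }
    return ids;
}

// Workaround sketch: the base instance reaches the shader through some other
// channel (for example a per-draw constant or buffer) and is added by hand.
std::uint32_t globalInstanceId(std::uint32_t svInstanceId, std::uint32_t baseInstance) {
    assert(svInstanceId <= std::numeric_limits<std::uint32_t>::max() - baseInstance);
    return svInstanceId + baseInstance;
}
```

With `startInstance = 10` and three instances, the DirectX-style IDs are 0, 1, 2 and the Vulkan-style IDs are 10, 11, 12; adding the separately supplied base to the DirectX-style IDs gives the same 10, 11, 12.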
Dozen (Vulkan on top of DX12) actually emulates this in a very interesting fashion. For every shader that uses the SPIR-V instance ID it injects an extra buffer, and for every draw call it dispatches a compute shader that grabs the base instance data from that buffer and writes it into a separate one to pass in as a parameter. We cannot do that, as our shaders go through FXC (or DXC) as is, and once they’re compiled they’re final from DirectX’s perspective. It also has a more than slight performance penalty.
How would you then offset the InstanceIds, based on StartInstance or some other value? I’d like to index into one large buffer holding all my per-draw data, e.g. transform matrices.
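For illustration, one way such a layout could work is sketched below in C++ as a CPU-side model. This is an assumption about a possible scheme, not something Unity provides: each draw gets a base offset into the shared buffer (an exclusive prefix sum over the instance counts), and the zero-based instance ID is added to it. `computeBaseOffsets` and `sharedBufferIndex` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Exclusive prefix sum over per-draw instance counts: entry d is the index in
// the shared buffer where the instances of draw d begin.
std::vector<std::uint32_t> computeBaseOffsets(const std::vector<std::uint32_t>& instanceCounts) {
    std::vector<std::uint32_t> bases;
    bases.reserve(instanceCounts.size());
    std::uint32_t running = 0;
    for (std::uint32_t count : instanceCounts) {
        bases.push_back(running);
        running += count;
    }
    return bases;
}

// Index into the shared per-instance buffer (e.g. transform matrices) for a
// given draw and a zero-based instance ID within that draw.
std::uint32_t sharedBufferIndex(const std::vector<std::uint32_t>& bases,
                                std::uint32_t drawId,
                                std::uint32_t svInstanceId) {
    assert(drawId < bases.size());
    return bases[drawId] + svInstanceId;
}
```

For instance counts of 3, 5 and 2, the bases come out as 0, 3 and 8, so instance 4 of draw 1 reads element 7. The open question in the thread is how the shader learns `drawId` (or the base directly) on DirectX.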
Ideally I’d like to use the full potential of ExecuteIndirect, but I understand Unity’s API can’t cater to a single graphics API like that.
So if I understand correctly, where the manual says to pass in SV_DrawId, this is impossible? In that case I’m quite confused by UnityIndirect.cginc.
Apologies for the necro. Start instance is supposed to be an offset into a bound instance step rate buffer. Support for this concept is not exposed by Unity. For reference: