GPU Instancing (Dynamic Batching) Not Working on Oculus Quest 2 (Android). Works fine in-editor.

I have a VR scene that has a grid of 1024 identical meshes that move dynamically. I have GPU instancing enabled on the material, and in-editor I have verified that this is working fine. However, when I compile an Android build for the Oculus Quest, GPU instancing doesn’t work and each mesh is rendered in its own individual draw call, killing my frame rate. I have verified this with RenderDoc.

I am using Unity 2019.4.16f1 with the URP. Dynamic batching is turned on in my URP Asset:

[Screenshot: DynamicBatchingEnabled.png]

In-editor there are 40 draw calls:

On the Oculus Quest 2, there are 1058 draw calls, verified via RenderDoc:

[Screenshot: RenderDoc_DrawCalls.png]

Here is the frame in RenderDoc showing each mesh being rendered individually instead of instanced:

Does anyone know why this is happening? Any help getting GPU Instancing / dynamic batching working on the Quest would be greatly appreciated!

So as it turns out, having the SRP Batcher enabled appears to break GPU instancing on the Quest (using Unity 2019.4.16f1 with URP 7.3.1). When I turned off the SRP Batcher, GPU instancing worked properly on the Quest, and I got a very small number of draw calls and a smooth frame rate.

@aleksandrk Do you know what the current state of the SRP Batcher is on the Quest (Android)? Is it broken? It’s disheartening that I need to turn it off in order to get good performance on the Quest, since the SRP Batcher appears to have a lot of performance benefits.

I would suggest reporting a bug - this would ensure relevant people look at this.

Did anyone find a workaround?
Same problem here, on Unity version 2020.3.1f1,
using URP version 10.3.2.
In the editor I'm getting 17 SetPass calls and 20 batches; on the Quest 2 I'm getting 68 batches and the frame rate is very bad.

Similar problem here, but ours is with static objects not being batched on Android when using the SRP Batcher (Quest 2).

You can still use GPU instancing if you enable it manually by doing calls to the Graphics API directly.

Graphics.DrawMeshInstanced() - This has a limit of 1023 objects per call. You can make multiple calls to draw more, or use the next method instead.
Graphics.DrawMeshInstancedIndirect() - A bit more complicated to use because you have to deal with Compute Buffers, but this way you can batch a shitton of objects into a single draw call (I tested it with around 2 million grass quads at a steady 60 fps in the editor).
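
To illustrate the 1023-per-call limit mentioned above, here is a minimal sketch of how an instance list could be split into ranges, one per DrawMeshInstanced() call. It is plain Python arithmetic rather than Unity code, and `split_into_batches` is a hypothetical helper name, not part of any Unity API. In a Unity script the same loop would presumably feed each (start, count) range to its own draw call.

```python
# Per-call instance cap for Graphics.DrawMeshInstanced, as stated in the post above.
MAX_INSTANCES_PER_CALL = 1023


def split_into_batches(instance_count, max_per_call=MAX_INSTANCES_PER_CALL):
    """Return a list of (start, count) ranges, one range per draw call.

    Hypothetical helper for illustration only; not a Unity API.
    """
    if instance_count < 0:
        raise ValueError("instance_count must be non-negative")
    batches = []
    start = 0
    while start < instance_count:
        # Take as many instances as the cap allows, or whatever is left.
        count = min(max_per_call, instance_count - start)
        batches.append((start, count))
        start += count
    return batches


# The 1024-mesh grid from the original question is one instance over the cap,
# so it needs two calls.
print(split_into_batches(1024))  # [(0, 1023), (1023, 1)]
```

So a grid of 1024 meshes would take two calls instead of one, which is still far below one draw call per mesh.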

Keep in mind that the shaders you use need to support these calls as well. This is hard to do with Shader Graph in particular, but not impossible: it takes some unconventional custom node usage to add support for arrays (DrawMeshInstanced) or StructuredBuffers (DrawMeshInstancedIndirect).

Here is an old demo in which I use multiple DrawMeshInstanced() calls to get GPU instancing on the Oculus Quest 1. I am pretty sure I could improve performance at higher object counts by switching to DrawMeshInstancedIndirect():
https://www.youtube.com/watch?v=fJIvyZo_T1U

Hi Desox

Are you able to share this project somewhere?
The issue I’m having with Graphics.DrawMeshInstancedIndirect() is that the CPU is sitting at about 20% while the GPU is getting hammered, and the frame rate is 20 fps (Unity 2019.4.20 on Quest 1).

Hey @prawn-star, unfortunately no, not yet. But regarding your issue: are you sure that you are really instancing the drawn meshes? That is, did you write your shader to support instancing in a way that lets you draw them with DrawMeshInstancedIndirect() (StructuredBuffer in the shader, Compute Buffers set up in C#, etc.)?

Hey Desox

Yep, pretty sure we are really instancing them. DrawMeshInstancedIndirect draws them with all the correct transforms and the correct number of instances. They are not placed in the scene, and the Frame Debugger shows them as Instanced.
I’m just wondering if it’s worth actually putting them in the scene as objects with GPU Instancing in the shader flicked on. (They animate into position so we can’t mark them as static)
Would that help with the GPU performance?

Unfortunately, if you are on URP, putting them into the scene with activated GPU instancing won’t work, as it is not supported in URP if you also have enabled the SRP batcher.

There can be several reasons why your GPU is under so much pressure, one of which is the shader you are using.
If you can see them being instanced in the Frame Debugger, it looks like it's working, at least in the editor.
Is your shader written in a way that is supported on the Quest as well? Where is the bottleneck in your shader? Does it have lots of vertex calculations?
In my tests I also saw a lot of issues when I did more than some basic vertex calculations. To my knowledge, tile-based mobile GPUs like the one in the Quest have more trouble with the vertex stage than PC GPUs do.

Same problem here with the Android build. Batching works in the editor, but not in the build.
Unity 2022.1.3, URP.