I realise this topic may have been discussed here many times already, and I apologise for bringing it up again, but I've recently been running some experiments comparing the SRP Batcher against GPU Instancing and I have some thoughts on the matter that I would like to share.
My investigation was prompted by the fact that I need to render many meshes that also require per-instance material data. One of the well-known ways to set per-instance data efficiently is to use a MaterialPropertyBlock. However, as mentioned in many threads and in the Unity Manual itself, using a MaterialPropertyBlock breaks the SRP Batcher, making it revert to individual draw calls for each mesh. As far as I can tell, the only way to set per-instance material data while still using the SRP Batcher is to create an instance of the material for each mesh and apply the property update to that instance.
I ended up performing two tests. For both, I created a script that instantiates 10,000 cube meshes spread evenly throughout the world. Every four seconds, I update a colour material property on each cube instance with a random colour. For the first test, I used the SRP Batcher with the material clone technique to update each cube's material property. For the second test, I used a shader with an instanced property, which in turn disables SRP Batcher compatibility, and enabled GPU Instancing. The instanced property was updated via a MaterialPropertyBlock. The results of the tests (performed on a Samsung Galaxy S7 International Edition, which uses the Exynos CPU) were as follows:
SRP Batcher:
Render Opaque Main Thread Time: 18.37ms
Render Opaque Render Thread Time: 82.05ms
GPU Instancing:
Render Opaque Main Thread Time: 8.55ms
Render Opaque Render Thread Time: 7.88ms
As you can see, a very stark difference!
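To put a number on that difference, here is a quick bit of arithmetic on the timings above. It is just a throwaway Python calculation, not Unity code; the dictionary keys and the `speedup` helper are names I made up for illustration.

```python
# Timings copied from the results above (Render Opaque, in milliseconds).
srp_batcher = {"main_thread_ms": 18.37, "render_thread_ms": 82.05}
gpu_instancing = {"main_thread_ms": 8.55, "render_thread_ms": 7.88}


def speedup(slow_ms, fast_ms):
    """Return how many times smaller fast_ms is than slow_ms."""
    return slow_ms / fast_ms


# Ratio of SRP Batcher time to GPU Instancing time for each thread.
ratios = {key: speedup(srp_batcher[key], gpu_instancing[key]) for key in srp_batcher}

for key, ratio in ratios.items():
    print(f"{key}: GPU Instancing is about {ratio:.1f}x faster")
```

In this test, that works out to roughly 2.1x on the main thread and roughly 10.4x on the render thread in favour of GPU Instancing.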
After an extensive search of the forums, my impression is that the general opinion is that the SRP Batcher is intended to be a replacement for all existing render paths, which includes GPU Instancing. This is somewhat evidenced by the fact that GPU Instancing and the SRP Batcher are not compatible. However, my results show that GPU Instancing still has its uses from a performance perspective. The other issue is the memory cost of instancing materials to hold per-instance data. In my test I generated 26 MB of material data, since 10,000 instances of the material were created.
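For a sense of scale, the memory figure can be broken down per material. This is only arithmetic on the numbers I reported, written as a small Python snippet; I am assuming the profiler's "MB" could mean either 10^6 or 2^20 bytes, so both are shown, and the variable names are my own.

```python
material_count = 10_000  # one cloned material per cube
total_mb = 26            # total material memory reported above

# Assumption: "MB" may be decimal (10^6 bytes) or binary (2^20 bytes).
per_material_decimal = total_mb * 1_000_000 / material_count
per_material_binary = total_mb * 1_048_576 / material_count

print(f"decimal MB: {per_material_decimal:.0f} bytes per material")
print(f"binary MB:  {per_material_binary:.0f} bytes per material")
```

Either way, that is somewhere around 2.6 to 2.7 KB per cloned material, just to carry a single colour per cube.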
The ideal scenario here is that a combination of the two technologies is used: if there is only a single instance of a mesh, it should go through the SRP Batcher, and as soon as there are multiple instances of a mesh, it should switch to GPU Instancing. However, given the different shader requirements of the two paths, the only way to do this, as far as I know, is to manage the switch yourself, most likely by swapping shaders.
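The selection rule I have in mind can be sketched as follows. This is plain Python to show the decision only, not Unity API; `choose_render_paths`, the path labels, and the `instancing_threshold` parameter are all hypothetical, and in a real project the result would have to drive a shader or material swap per renderer.

```python
from collections import Counter


def choose_render_paths(mesh_names, instancing_threshold=2):
    """Map each distinct mesh to a render path based on how often it is used.

    Meshes used at least `instancing_threshold` times are assigned to
    GPU Instancing; everything else stays on the SRP Batcher.
    """
    counts = Counter(mesh_names)
    return {
        mesh: "gpu_instancing" if count >= instancing_threshold else "srp_batcher"
        for mesh, count in counts.items()
    }


# Hypothetical scene: many cubes plus two one-off meshes.
scene = ["cube"] * 10_000 + ["hero_character", "terrain"]
paths = choose_render_paths(scene)

for mesh, path in paths.items():
    print(f"{mesh}: {path}")
```

A threshold of 2 matches "as soon as there are multiple instances"; whether a higher break-even point would be better in practice is something I have not measured.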
So my main questions are: are my thoughts on SRP Batcher versus GPU Instancing usage correct? Am I also correct about the method I used to set per-instance data with the SRP Batcher, or is there a better method that I am not aware of? Lastly, are there any upcoming changes that would improve the performance of instanced meshes with the SRP Batcher? (The last question is mostly directed at anyone at Unity.)
Curious to hear what others think of this. Many thanks!