Hi everyone!
For quite some time we have been working on some deep performance improvements to how we do rendering and batching of draws for Unity. These improvements are designed to ‘just work’ out of the box with projects you have already created with URP and HDRP. We would like to share this work with you now so that we can get some feedback. We want it to be rock solid and to work on all platforms that are capable of handling this improvement. Everything described in this post is available in 2023.3a8.
Background
About a year ago we introduced the reworked BatchRendererGroup API for Unity 2022.1. While this API allows for some great performance benefits, you either need to use Entities Graphics or do a lot of custom coding to put your objects into it. We thought this was not good enough: we want many more Unity projects to benefit from faster batching and performance without needing to be modified. So we decided to write a system to make this possible…
Garden Scene Running with the GPU Resident Drawer.
GPU Resident Drawer
In the latest 2023.3 alpha we have landed a new rendering system called the GPU Resident Drawer. This is a ‘behind the curtain’, GPU-driven system: you author your game using GameObjects as usual, and they are ingested and rendered via a special fast path with better instancing. The improvements you will see depend on the scale of your scenes and the amount of instancing you use: the more instanceable objects you render, the larger the benefit. This feature is specifically for standard MeshRenderers. It will not handle skinned mesh renderers, VFX Graphs, particle systems or similar effects renderers.
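To make the "more instanceable objects, larger benefit" point concrete, here is a toy Python model (not Unity code). It assumes the simplest possible rule: every object costs one draw call without instancing, and objects sharing the same mesh and material collapse into one instanced draw. The `Renderable` type and both function names are made up for illustration, and real batching has further limits that this ignores.

```python
from typing import Iterable, NamedTuple


class Renderable(NamedTuple):
    """Hypothetical stand-in for a MeshRenderer: one mesh drawn with one material."""
    mesh: str
    material: str


def draws_without_instancing(objects: Iterable[Renderable]) -> int:
    # Simplified assumption: one draw call per object.
    return sum(1 for _ in objects)


def draws_with_instancing(objects: Iterable[Renderable]) -> int:
    # Simplified assumption: objects with identical (mesh, material)
    # are submitted together as a single instanced draw.
    return len({(o.mesh, o.material) for o in objects})


# A scene dominated by repeated props benefits the most.
scene = (
    [Renderable("tree", "bark")] * 500
    + [Renderable("rock", "stone")] * 200
    + [Renderable("statue", "marble")]
)

print(draws_without_instancing(scene), draws_with_instancing(scene))
```

A scene made of mostly unique meshes would show almost no difference between the two counts, which matches the note that gains depend on how much instancing the content allows.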
How to enable the GPU Resident Drawer
The system can be enabled within the HDRP or URP Render Pipeline Asset. You should find the option GPU Resident Drawer Mode. Selecting Instanced Drawing enables the feature, and you can also select if you want the feature to be enabled just in play mode or in edit mode as well.
(Screenshots: the GPU Resident Drawer setting in the URP and HDRP Render Pipeline Assets.)
Some specific settings also need to be set. There are UI affordances that will tell you if your project is not configured properly. The specifics are:
- BatchRendererGroup variant stripping needs to be set to Keep All, otherwise standalone player builds will not render the converted objects.
- In URP you must be in Forward+ rendering mode.
- Static batching should be turned off. This is not required, but with static batching off instancing will do a better job, which results in fewer draw calls.
- Under Lightmapping Settings in Lighting Settings, check Fixed Lightmap Size and uncheck Use Mipmap Limits. This is also not required, but will also result in fewer draw calls.
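The checklist above can be summarised as a small validation routine. The sketch below is plain Python, not Unity code. `ProjectSettings` and its field names are hypothetical and only mirror the options named in the list; in the Editor the built-in UI warnings do this job.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProjectSettings:
    """Hypothetical snapshot of the settings listed above (not a Unity type)."""
    pipeline: str                     # "URP" or "HDRP"
    brg_variant_stripping: str        # e.g. "Keep All"
    urp_rendering_path: str           # e.g. "Forward+"; ignored for HDRP
    static_batching: bool
    fixed_lightmap_size: bool
    lightmap_use_mipmap_limits: bool


def check_settings(s: ProjectSettings) -> List[str]:
    """Return one message per setting that deviates from the checklist."""
    problems = []
    if s.brg_variant_stripping != "Keep All":
        problems.append("required: set BatchRendererGroup variant stripping to Keep All")
    if s.pipeline == "URP" and s.urp_rendering_path != "Forward+":
        problems.append("required: use the Forward+ rendering mode in URP")
    if s.static_batching:
        problems.append("recommended: turn off static batching")
    if not s.fixed_lightmap_size:
        problems.append("recommended: check Fixed Lightmap Size")
    if s.lightmap_use_mipmap_limits:
        problems.append("recommended: uncheck Use Mipmap Limits")
    return problems


example = ProjectSettings("URP", "Strip All", "Forward", True, True, False)
for message in check_settings(example):
    print(message)
```

Separating "required" from "recommended" follows the list: the first two items affect whether objects render through the new path at all, while the last two only affect how many draw calls result.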
The system supports dynamic changes to GameObjects: the conversion runs incrementally and will pick up newly created objects as well as changed objects. This happens once per frame, after LateUpdate but before rendering begins. This means that if you move objects during rendering (for example in the RenderPipelineManager.beginCameraRendering callback) they may have incorrect data when rendering happens. You will want to force these objects to NOT be rendered via the GPU Resident Drawer by adding the DisallowGPUDrivenRendering MonoBehaviour to them.
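The ordering problem described here can be shown with a tiny simulation. It is a toy Python model of one frame, not Unity code; `render_frame` and its parameters are invented for illustration. It only encodes the ordering stated above: LateUpdate, then a single sync of the GPU-side copy, then per-camera callbacks.

```python
def render_frame(late_update_pos: float,
                 per_camera_pos: float,
                 use_resident_drawer: bool) -> float:
    """Return the position an object is drawn at in this toy frame."""
    # 1. Game code moves the object in LateUpdate.
    position = late_update_pos

    # 2. Once per frame, after LateUpdate, the GPU-side copy is refreshed.
    gpu_cache = position

    # 3. A per-camera callback moves the object again. The GPU-side copy
    #    was already refreshed, so it does not see this change.
    position = per_camera_pos

    # 4. The resident drawer path reads the cached copy; the regular
    #    GameObject path reads the current transform.
    return gpu_cache if use_resident_drawer else position


print(render_frame(1.0, 2.0, use_resident_drawer=True))
print(render_frame(1.0, 2.0, use_resident_drawer=False))
```

In this model the resident drawer path draws the object at the LateUpdate position, which is why objects moved per camera need to be excluded from it.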
Finally, the system is also compatible with Umbra occlusion culling, so if you are already using that in your projects you will continue to see that benefit.
Objects rendered via the GPU Resident Drawer show up in the frame debugger as ‘Hybrid Batch Group’. In the spaceship scene the full g-buffer is laid down using the GPU Resident Drawer path.
Feature Support
The GPU Resident Drawer is supported on the modern rendering backends within Unity. Specifically, this functionality should work anywhere compute shaders are enabled. When your project is running on a platform that does not meet the required hardware capabilities, rendering will fall back to the traditional, non-GPU-driven path.
One further specific note: OpenGL and GLES are explicitly not supported. The GPU Resident Drawer will fall back to regular game object rendering on these rendering backends even if they support compute shaders.
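Put together, the platform rule from the two paragraphs above is "compute shaders available, and not OpenGL or GLES". A minimal Python sketch of that decision follows. The function name and the backend strings are assumptions for illustration and do not correspond to Unity identifiers.

```python
# Backends that always fall back, even when they support compute shaders.
UNSUPPORTED_BACKENDS = {"OpenGL", "GLES"}


def uses_resident_drawer(backend: str, supports_compute: bool) -> bool:
    """Return True if this toy platform would take the GPU Resident Drawer path."""
    if backend in UNSUPPORTED_BACKENDS:
        return False          # explicit exclusion takes priority
    return supports_compute   # otherwise compute shader support decides


for backend, compute in [("Metal", True), ("GLES", True), ("Vulkan", False)]:
    path = "GPU Resident Drawer" if uses_resident_drawer(backend, compute) else "regular GameObject path"
    print(backend, "->", path)
```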
When the feature is enabled, some objects may still not render via the new path. In these cases they will draw via the regular rendering paths.
Specific cases that are not compatible:
- The light probe usage on the renderer is set to use proxy volume
- The renderer is affecting or is affected by real time global illumination
- The renderer has a MaterialPropertyBlock attached
- The shader used by the material is incompatible with DOTS Instancing
- The renderer has per instance rendering callbacks attached (OnRenderObject etc)
- The GameObject has the DisallowGPUDrivenRendering component attached.
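The list above amounts to a per-renderer eligibility test. The following Python sketch restates it as code to help when auditing a scene by hand. `RendererInfo` and its fields are hypothetical names that mirror the bullets one to one; this is not how Unity exposes or evaluates these conditions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RendererInfo:
    """Hypothetical description of one renderer (not a Unity type)."""
    uses_light_probe_proxy_volume: bool = False
    uses_realtime_gi: bool = False
    has_material_property_block: bool = False
    shader_supports_dots_instancing: bool = True
    has_per_instance_callbacks: bool = False
    has_disallow_component: bool = False


def fallback_reasons(r: RendererInfo) -> List[str]:
    """Return why a renderer would use the regular path; empty means eligible."""
    reasons = []
    if r.uses_light_probe_proxy_volume:
        reasons.append("light probe usage is set to proxy volume")
    if r.uses_realtime_gi:
        reasons.append("affects or is affected by real time global illumination")
    if r.has_material_property_block:
        reasons.append("has a MaterialPropertyBlock attached")
    if not r.shader_supports_dots_instancing:
        reasons.append("shader is incompatible with DOTS Instancing")
    if r.has_per_instance_callbacks:
        reasons.append("has per instance rendering callbacks")
    if r.has_disallow_component:
        reasons.append("has the DisallowGPUDrivenRendering component")
    return reasons


print(fallback_reasons(RendererInfo(has_material_property_block=True)))
```

Returning every reason rather than stopping at the first makes it easier to see how much work it would take to move an object onto the fast path.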
Compatibility Notes
Not all objects can be rendered using the GPU Resident Drawer, and you may need to manually mark some objects to not render via this path.
The situations we are aware of where you might need to do this:
- You are using a ‘custom pass’ in URP and that custom pass does not support the DOTS keyword but the main material does. This cannot be detected by the system and the custom pass will fail to render.
- You are updating the transform on a per camera rendering basis. We update the objects in the GPU cache once per frame, right before the Unity rendering pipeline is executed. It is not recommended to update object positions while the render pipeline is active (i.e. per camera), but if you must do this then you will need to mark the objects to not go via the GPU pipeline.
To force a GameObject to render via the GameObject path instead of the GPU Resident Drawer, add the new DisallowGPUDrivenRendering component to it. If you need to use this component for situations outside of those listed above, please let us know why so we can improve the system or documentation.
Performance
The GPU Resident Drawer is specifically a CPU time optimization and may change GPU performance characteristics; please read to the end of this section to understand more.
How much CPU time is gained varies depending on the content that is rendered. Content with more instancing will benefit more, as fewer draw calls need to be submitted to the GPU. In our testing we have seen some larger scenes benefit massively, halving the CPU frame time. Smaller scenes also tend to benefit, but often show only marginal improvements.
Here are some numbers from an internal test project. Your scenes may differ, so please profile your own projects to be sure.
In this project, running in the Editor on Metal, the regular GameObject path takes about 15 ms of main thread rendering time and 31 ms of render thread time.
CPU Time Improvements
GPU performance notes
GPU performance may be negatively affected by drawing using the DOTS Instancing variant and this will be different depending on the device the content is rendering on. This is due to how data is loaded by the shaders which is different when using this feature. This effect will be more prominent on lower powered mobile GPUs but in many cases is also offset by the reduced number of draw calls. We would love to hear your feedback on the performance when using this feature on the projects that you are developing.
Additional Notes
- Culling might differ slightly as the culling code has been reimplemented in C#. If you notice issues here please report them.
- Setting BatchRendererGroup variant stripping to Keep All will increase the shader variant count for player builds. This means that you may have longer build times.
- Lightmaps are handled differently when this code path is enabled: we use TextureArrays with a dynamic index in the shader to look up the lightmap to use. This will lead to increased GPU memory use for lightmaps when this feature is enabled. We are investigating how to improve this.
How Can I Provide Feedback
The best way to provide feedback is in this thread. Try this feature on your projects and report any issues or performance numbers here. If you encounter any bugs we’ll likely ask for a bug report, but feel free to post here first.