GPU Instancing in the 2D Renderer?

I’m making a game where you can shoot asteroids, and a lot of small asteroid debris flies out. The debris pieces are just 3-4 different models, but they vary slightly in color and size. That seems ideal for GPU Instancing.

I made a material with the shader URP/2D/Sprite-Lit-Default, enabled its GPU Instancing checkbox, and assigned it to the sprite renderer of the asteroid debris.

But when I clone a GameObject, the batch count still increases. Isn’t that the same as draw calls?

So am I doing this wrong?

I don’t think the default sprite shader supports GPU instancing. I’m not completely sure though. I’d be interested in the answer to this.

Interesting. I just started messing with Shader Graph, but I have no idea if that’s something that could be made in there.

Doubtful. Shader Graph trades fine-grained control and optimization for abstraction and ease of use. I could be completely wrong about the default sprite shader not supporting instancing, since internally sprites are represented by simple quads.

That said, I don’t think the tilemap system has this issue, as there is a single renderer on each tilemap. Any of the Unity 2D staff want to chime in? @rustum

Tilemaps should benefit greatly from GPU instancing, I guess. With the default URP > 2D material there is a GPU Instancing checkbox, but with my own Shader Graph material there isn’t.

Last word from a Unity dev on Tilemap GPU Instancing support (Jan 31, 2020):
[quote=“rustum, post:2, topic: 771724, username:rustum”]
[GPU Instancing for Tilemaps] isn’t something we’re actively working on, but we’ll put it on our list of ideas with potential. The suite of 2D tools in Unity has been focused on functionality previously and we are now turning our attention to improved performance and better workflows!
[/quote]
Maybe there’s an update @rustum could share about 2D system optimizations Unity is working on?

Hi @Lo-renzo and @spryx !
We will share some information soon on all the performance and workflow improvements that we are making. As for the specific topic of Tilemap GPU Instancing, I’ll let @ChuanXin respond with more detail.

We are looking into improving performance with the Tilemap and TilemapRenderer. This includes using GPU instancing for the TilemapRenderer. GPU instancing raises issues that we still need to look into, including the sorting of the different Sprites used within the Tilemap and the size of the parameters passed in for instancing compared to standard batched rendering.

If there are particular use-cases regarding this, do let us know as well so that we can handle this better!

The default sprite shader does support GPU instancing. You will need to make a new material with the default sprite shader and activate GPU instancing. This should be the same for the URP and the default lit shader as well.

[Attached screenshot: upload_2020-6-2_15-53-36.png]

Batches are the groups of draw calls that can be done together. I am not certain about your GameObject/Renderer, but even assuming that everything can be instanced together, it is still possible that there are so many items that more than one draw call is required. The Frame Debugger can help identify that.
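As a rough illustration of why a fully instanceable set of sprites can still need several draw calls, here is a small arithmetic sketch in Python. The function name and the limit of 500 instances per draw are assumptions made up for this example; the real limit depends on the platform and on how much per-instance data the shader uses.

```python
import math


def instanced_draw_calls(instance_count: int, max_instances_per_draw: int) -> int:
    """Draw calls needed when one instanced call can carry at most
    max_instances_per_draw instances (a hypothetical limit)."""
    if instance_count <= 0:
        return 0
    return math.ceil(instance_count / max_instances_per_draw)


# Assumed limit for illustration only, not a value taken from this thread.
ASSUMED_LIMIT = 500

for debris_count in (100, 500, 1200):
    calls = instanced_draw_calls(debris_count, ASSUMED_LIMIT)
    print(debris_count, "debris sprites ->", calls, "instanced draw call(s)")
```

So cloning more debris can still raise the batch count once a single instanced call is full; the Frame Debugger shows where each call actually ends.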

If it does not quite work out for you, do let us know about it! Screenshots and reproduction examples will help as well!

Thank you @ChuanXin and @rustum . I will try to reproduce this in a simpler project when I have the chance.

Is there anything special to consider when using a tilemap where each tile is 512x512? We’re making a cartoon-like game, which has quite a high resolution, but if that doesn’t work, we would need to scale it down. I guess GPU instancing here would save some performance at least.

I am not fully certain if GPU instancing would help with high texture resolutions. Also, high texture resolutions do result in less efficient usage of Sprite Atlasing, which is a good technique that can help with performance too. It would be interesting to know what the size of a Tile is relative to the screen, and roughly how many Tiles are shown each frame too.
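To put a number on the atlasing point, here is a back-of-the-envelope sketch in Python. It assumes a simple square grid packing, a 2048x2048 atlas page and an optional padding value; all three are illustrative assumptions, and Unity's actual Sprite Atlas packer may behave differently.

```python
def sprites_per_atlas(atlas_size: int, sprite_size: int, padding: int = 0) -> int:
    """Square sprites that fit on one square atlas page with naive grid packing.
    Each sprite is assumed to occupy sprite_size + padding pixels per side."""
    cell = sprite_size + padding
    per_row = atlas_size // cell
    return per_row * per_row


ASSUMED_ATLAS = 2048  # hypothetical page size in pixels

print(sprites_per_atlas(ASSUMED_ATLAS, 512))             # 512x512 tiles, no padding
print(sprites_per_atlas(ASSUMED_ATLAS, 512, padding=2))  # same tiles with 2 px padding
print(sprites_per_atlas(ASSUMED_ATLAS, 128))             # smaller 128x128 tiles
```

Under these assumptions only a handful of 512x512 tiles share one page, while many more 128x128 tiles would, which is the sense in which large tiles use an atlas less efficiently.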

For performance, it would be great if you could share some profiler screenshots for your project! I can understand if you want to keep things related to your product private also!

@ChuanXin , @rustum
Can we please have some information on draw call batching with Sprite rendering, especially in relation to:

  1. URP
  2. SRP batching
  3. GPU instancing
  4. 2D Renderer
  5. Sorting Group component + sorting order / layer ( does this affect batching at all ? )

Assuming that dynamic batching is running and the Renderers are rendered using the Transparent render queue (e.g. 2D Renderers with the default Sprite shader):

  1. GPU instancing

  2. 2D Renderer

  3. Sorting Group component + sorting order / layer ( does this affect batching at all ? )
    Yes, it does! Renderers are batched in the order they are sorted, provided their batching criteria are fulfilled. The sorting guide can be found here (Unity - Manual: 2D Sorting). Working through this sorted queue of renderers, the Unity rendering pipeline tries to add the next renderer in sequence to the current batch if it meets the criteria listed here (Unity - Manual: Draw call batching). If the next renderer does not meet the criteria (for SpriteRenderers, this is generally because it uses a different Texture), the current batch is sent to be drawn and a new batch is started. If the Renderers in the current batch have GPU instancing enabled (and there is more than one Renderer), they are instanced on the GPU. If not, they are batched on the CPU. The Frame Debugger can help a lot in identifying why Renderers do not batch.

  4. SRP batching
    Currently, this is not supported for the 2D Renderers (SpriteRenderer, SpriteShapeRenderer and TilemapRenderer).

  5. URP
    @yuanxing_cai @Chris_Chu can explain better regarding URP and 2D lights!
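The sort-then-batch rule described under item 3 can be sketched as a small Python simulation. This is a simplified model, not Unity's implementation: the `Renderer` class, its fields and `build_batches` are hypothetical, sorting is reduced to a single sorting order, and the batching criteria are reduced to "same texture and same instancing setting".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Renderer:  # hypothetical stand-in, not a Unity type
    name: str
    sorting_order: int
    texture: str
    instancing: bool


def build_batches(renderers):
    """Walk the renderers in sorted order and start a new batch whenever
    the next renderer does not match the current batch's criteria."""
    ordered = sorted(renderers, key=lambda r: r.sorting_order)
    groups = []
    for r in ordered:
        key = (r.texture, r.instancing)
        if groups and groups[-1][0] == key:
            groups[-1][1].append(r.name)   # criteria match: extend the batch
        else:
            groups.append((key, [r.name]))  # mismatch: flush and start anew
    batches = []
    for (_texture, instancing), names in groups:
        if len(names) == 1:
            mode = "single draw"
        elif instancing:
            mode = "GPU instanced"
        else:
            mode = "CPU batched"
        batches.append((mode, names))
    return batches


scene = [
    Renderer("debris_a", 0, "debris_atlas", True),
    Renderer("debris_b", 1, "debris_atlas", True),
    Renderer("ship", 2, "ship_texture", True),
    Renderer("debris_c", 3, "debris_atlas", True),
]
for mode, names in build_batches(scene):
    print(mode, names)
```

In this toy scene the ship sorts between the debris sprites and splits them into separate batches; giving the ship a sorting order outside the debris range would let all three debris sprites share one instanced batch.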

@ChuanXin Thank you! I haven’t reached a stage where we’re optimizing, so I’ll start a new thread in the future when we reach that point.

Any updates on this? I spent a lot of time figuring out that rendering a sprite is quite different from rendering a mesh, and that the SpriteRenderer does not support SRP batching even though the sprite shader shows the “SRP Batcher: compatible” label in the Inspector, which is very confusing! I’m using Unity 2020.3.22.

No updates on this. SRP Batching is not supported for the 2D Renderers (SpriteRenderer, SpriteShapeRenderer and TilemapRenderer).

This “just works” for me, though I don’t know whether it is optimal:

Shader "Universal Render Pipeline/2D/Sprite-Lit"
{
    Properties
    {
        _MainTex("Diffuse", 2D) = "white" {}
        _MaskTex("Mask", 2D) = "white" {}
        _NormalMap("Normal Map", 2D) = "bump" {}
    }

    HLSLINCLUDE
    #include "Packages/com.unity.render-pipelines.universal/ShaderLibrary/Core.hlsl"
    ENDHLSL

    SubShader
    {
        Tags {"Queue" = "Transparent" "RenderType" = "Transparent" "RenderPipeline" = "UniversalPipeline" }

        Blend SrcAlpha OneMinusSrcAlpha
        Cull Off
        ZWrite Off

        Pass
        {
            Tags { "LightMode" = "Universal2D" }
            HLSLPROGRAM
            #pragma exclude_renderers gles gles3 glcore
            #pragma target 4.5

            #pragma vertex CombinedShapeLightVertex
            #pragma fragment CombinedShapeLightFragment
            #pragma multi_compile USE_SHAPE_LIGHT_TYPE_0 __
            #pragma multi_compile USE_SHAPE_LIGHT_TYPE_1 __
            #pragma multi_compile USE_SHAPE_LIGHT_TYPE_2 __
            #pragma multi_compile USE_SHAPE_LIGHT_TYPE_3 __

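            // Generate instanced shader variants; this is what makes the material's GPU Instancing option available.
            #pragma multi_compile_instancing
            #pragma multi_compile _ DOTS_INSTANCING_ON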

            struct Attributes
            {
                float3 positionOS   : POSITION;
                float4 color        : COLOR;
                float2 uv           : TEXCOORD0;
                // Carries the per-instance ID into the vertex shader.
                UNITY_VERTEX_INPUT_INSTANCE_ID
            };

            struct Varyings
            {
                float4 positionCS   : SV_POSITION;
                half4  color        : COLOR;
                float2 uv           : TEXCOORD0;
                half2  lightingUV   : TEXCOORD1;
                UNITY_VERTEX_OUTPUT_STEREO
            };

            #include "Packages/com.unity.render-pipelines.universal/Shaders/2D/Include/LightingUtility.hlsl"

            TEXTURE2D(_MainTex);
            SAMPLER(sampler_MainTex);
            TEXTURE2D(_MaskTex);
            SAMPLER(sampler_MaskTex);
            TEXTURE2D(_NormalMap);
            SAMPLER(sampler_NormalMap);

            CBUFFER_START(UnityPerMaterial)
            half4 _MainTex_ST;
            half4 _NormalMap_ST;
            CBUFFER_END

            #if USE_SHAPE_LIGHT_TYPE_0
            SHAPE_LIGHT(0)
            #endif

            #if USE_SHAPE_LIGHT_TYPE_1
            SHAPE_LIGHT(1)
            #endif

            #if USE_SHAPE_LIGHT_TYPE_2
            SHAPE_LIGHT(2)
            #endif

            #if USE_SHAPE_LIGHT_TYPE_3
            SHAPE_LIGHT(3)
            #endif

            Varyings CombinedShapeLightVertex(Attributes v)
            {
                Varyings o = (Varyings)0;
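                // Set up the instance ID so per-instance data (such as the object-to-world matrix) is read for this instance.
                UNITY_SETUP_INSTANCE_ID(v);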
                UNITY_INITIALIZE_VERTEX_OUTPUT_STEREO(o);

                o.positionCS = TransformObjectToHClip(v.positionOS);
                o.uv = TRANSFORM_TEX(v.uv, _MainTex);
                float4 clipVertex = o.positionCS / o.positionCS.w;
                o.lightingUV = ComputeScreenPos(clipVertex).xy;
                o.color = v.color;
                return o;
            }

            #include "Packages/com.unity.render-pipelines.universal/Shaders/2D/Include/CombinedShapeLightShared.hlsl"

            half4 CombinedShapeLightFragment(Varyings i) : SV_Target
            {
                half4 main = i.color * SAMPLE_TEXTURE2D(_MainTex, sampler_MainTex, i.uv);
                half4 mask = SAMPLE_TEXTURE2D(_MaskTex, sampler_MaskTex, i.uv);

                return CombinedShapeLightShared(main, mask, i.lightingUV);
            }
            ENDHLSL
        }

        Pass
        {
            Tags { "LightMode" = "NormalsRendering"}
            HLSLPROGRAM
            #pragma exclude_renderers gles gles3 glcore
            #pragma target 4.5

            #pragma vertex NormalsRenderingVertex
            #pragma fragment NormalsRenderingFragment

            #pragma multi_compile_instancing
            #pragma multi_compile _ DOTS_INSTANCING_ON

            struct Attributes
            {
                float3 positionOS   : POSITION;
                float4 color        : COLOR;
                float2 uv           : TEXCOORD0;
                float4 tangent      : TANGENT;
                // Carries the per-instance ID into the vertex shader.
                UNITY_VERTEX_INPUT_INSTANCE_ID
            };

            struct Varyings
            {
                float4 positionCS   : SV_POSITION;
                half4  color        : COLOR;
                float2 uv           : TEXCOORD0;
                half3  normalWS     : TEXCOORD1;
                half3  tangentWS    : TEXCOORD2;
                half3  bitangentWS  : TEXCOORD3;
                UNITY_VERTEX_OUTPUT_STEREO
            };

            TEXTURE2D(_MainTex);
            SAMPLER(sampler_MainTex);
            TEXTURE2D(_NormalMap);
            SAMPLER(sampler_NormalMap);

            CBUFFER_START(UnityPerMaterial)
            half4 _MainTex_ST;
            half4 _NormalMap_ST;
            CBUFFER_END

            Varyings NormalsRenderingVertex(Attributes attributes)
            {
                Varyings o = (Varyings)0;
                UNITY_SETUP_INSTANCE_ID(attributes);
                UNITY_INITIALIZE_VERTEX_OUTPUT_STEREO(o);

                o.positionCS = TransformObjectToHClip(attributes.positionOS);
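                o.uv = TRANSFORM_TEX(attributes.uv, _NormalMap);
                // Note: this second assignment overwrites the line above, so the _NormalMap tiling/offset is not applied.
                o.uv = attributes.uv;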
                o.color = attributes.color;
                o.normalWS = TransformObjectToWorldDir(float3(0, 0, -1));
                o.tangentWS = TransformObjectToWorldDir(attributes.tangent.xyz);
                o.bitangentWS = cross(o.normalWS, o.tangentWS) * attributes.tangent.w;
                return o;
            }

            #include "Packages/com.unity.render-pipelines.universal/Shaders/2D/Include/NormalsRenderingShared.hlsl"

            half4 NormalsRenderingFragment(Varyings i) : SV_Target
            {
                half4 mainTex = i.color * SAMPLE_TEXTURE2D(_MainTex, sampler_MainTex, i.uv);
                half3 normalTS = UnpackNormal(SAMPLE_TEXTURE2D(_NormalMap, sampler_NormalMap, i.uv));
                return NormalsRenderingShared(mainTex, normalTS, i.tangentWS.xyz, i.bitangentWS.xyz, i.normalWS.xyz);
            }
            ENDHLSL
        }
        Pass
        {
            Tags { "LightMode" = "UniversalForward" "Queue"="Transparent" "RenderType"="Transparent"}

            HLSLPROGRAM
            #pragma exclude_renderers gles gles3 glcore
            #pragma target 4.5

            #pragma vertex UnlitVertex
            #pragma fragment UnlitFragment

            #pragma multi_compile_instancing
            #pragma multi_compile _ DOTS_INSTANCING_ON

            struct Attributes
            {
                float3 positionOS   : POSITION;
                float4 color        : COLOR;
                float2 uv           : TEXCOORD0;
                // Carries the per-instance ID into the vertex shader.
                UNITY_VERTEX_INPUT_INSTANCE_ID
            };

            struct Varyings
            {
                float4 positionCS   : SV_POSITION;
                float4 color        : COLOR;
                float2 uv           : TEXCOORD0;
                UNITY_VERTEX_OUTPUT_STEREO
            };

            TEXTURE2D(_MainTex);
            SAMPLER(sampler_MainTex);

            CBUFFER_START(UnityPerMaterial)
            half4 _MainTex_ST;
            half4 _NormalMap_ST;
            CBUFFER_END

            Varyings UnlitVertex(Attributes attributes)
            {
                Varyings o = (Varyings)0;
                UNITY_SETUP_INSTANCE_ID(attributes);
                UNITY_INITIALIZE_VERTEX_OUTPUT_STEREO(o);

                o.positionCS = TransformObjectToHClip(attributes.positionOS);
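                o.uv = TRANSFORM_TEX(attributes.uv, _MainTex);
                // Note: this second assignment overwrites the line above, so the _MainTex tiling/offset is not applied.
                o.uv = attributes.uv;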
                o.color = attributes.color;
                return o;
            }

            float4 UnlitFragment(Varyings i) : SV_Target
            {
                float4 mainTex = i.color * SAMPLE_TEXTURE2D(_MainTex, sampler_MainTex, i.uv);
                return mainTex;
            }
            ENDHLSL
        }
    }
}

Sorry for necroing this thread but it’s important to me. Was GPU instancing finally applied to TilemapRenderers?

So, after reading around the forum, am I correct to say that:
When using a 2D renderer (e.g. SpriteRenderer) with a custom sprite shader, sprites cannot be batched if they have different material property values?

Example:
I have a car sprite and a custom sprite shader that allows me to change the door colors. I want to render 100 cars with different door colors. Is there no way to batch this?

  • SRP Batcher is not supported
  • MaterialPropertyBlocks break batching in SRP

If I’m correct, are there any plans to support this in the future?

This was added in Unity 2023.1:

  • 2D: Added SRP Batching for 2D Renderers and Particle Renderer to support URP.