Instanced rendering with texture arrays causing lag at certain zoom levels

I’ve got a problem I just can’t work out here. In our game, we’re rendering an isometric tile map, generally 50x50 tiles, with multiple possible layers - so around 2500-5000 quads to render the entire map. This would obviously run terribly without instancing, so I set it up to use a single texture array to render all the sprites (118 sprites currently), with each quad receiving an instanced _SpriteIndex property to point to an entry in this texture array.
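For context, the per-quad index is fed in roughly like this - a simplified sketch, not our exact code; the component, the DrawMeshInstanced approach, and the numbers are just illustrative, but the instanced float property (named to match the shader below) is the important part:

using UnityEngine;

public class TileMapRendererSketch : MonoBehaviour
{
    public Mesh quadMesh;         // a unit quad
    public Material tileMaterial; // material using the shader below, with GPU instancing enabled

    Matrix4x4[] _matrices;
    float[] _spriteIndices;
    MaterialPropertyBlock _block;

    void Start()
    {
        // Small demo grid; DrawMeshInstanced is limited to 1023 instances per call,
        // so the real 50x50 map gets split across a handful of batches.
        const int size = 30;
        _matrices = new Matrix4x4[size * size];
        _spriteIndices = new float[size * size];
        for (int i = 0; i < _matrices.Length; i++)
        {
            _matrices[i] = Matrix4x4.Translate(new Vector3(i % size, i / size, 0f));
            _spriteIndices[i] = i % 118; // which layer of the texture array this quad samples
        }

        _block = new MaterialPropertyBlock();
        _block.SetFloatArray("_SpriteIndex_Instanced", _spriteIndices);
    }

    void Update()
    {
        Graphics.DrawMeshInstanced(quadMesh, 0, tileMaterial, _matrices, _matrices.Length, _block);
    }
}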

This works perfectly fine zoomed out - the entire board renders at a smooth 60 FPS. It also works fine close up, though it does drop below 60 there. In the middle of those two zoom levels, however, it can slow down to single-digit FPS. The GPU time per frame spikes, GPU usage hits 100%, and the game slows to a crawl, even with no change in the number of draw calls or triangles drawn. I thought the problem could be mipmapping, but disabling mipmapping seems to have no effect.

Here is a version of the shader we are using, with only the necessary portions:

Shader "Sprites/MinimalExample"
{
    Properties
    {
    }

    SubShader
    {
        Tags
        {
            "Queue" = "Transparent"
            "IgnoreProjector" = "True"
            "RenderType" = "Transparent"
            "PreviewType" = "Plane"
        }

        Cull Off
        Lighting Off
        ZWrite Off
        Blend One OneMinusSrcAlpha

        Pass
        {
            CGPROGRAM

            #pragma vertex SpriteVert
            #pragma fragment SpriteFrag
            #pragma target 2.0
            #pragma multi_compile_instancing
            #pragma require 2darray

            #include "UnityCG.cginc"

            // Per-instance sprite index: selects which layer of the texture array this quad samples.
            UNITY_INSTANCING_BUFFER_START(PerDrawSprite)
                UNITY_DEFINE_INSTANCED_PROP(float, _SpriteIndex_Instanced)
            UNITY_INSTANCING_BUFFER_END(PerDrawSprite)

            #define _SpriteIndex    UNITY_ACCESS_INSTANCED_PROP(PerDrawSprite, _SpriteIndex_Instanced)

            // Single texture array holding all tile sprites (one sprite per layer).
            UNITY_DECLARE_TEX2DARRAY(_MainTexArray);

            struct appdata_t
            {
                float4 vertex   : POSITION;
                float2 texcoord : TEXCOORD0;
                UNITY_VERTEX_INPUT_INSTANCE_ID
            };

            struct v2f
            {
                float4 vertex   : SV_POSITION;
                float2 texcoord : TEXCOORD0;
                UNITY_VERTEX_INPUT_INSTANCE_ID
            };

            v2f SpriteVert(appdata_t IN)
            {
                v2f OUT;

                UNITY_SETUP_INSTANCE_ID(IN);

                OUT.vertex = UnityObjectToClipPos(IN.vertex);
                OUT.texcoord = IN.texcoord;

                UNITY_TRANSFER_INSTANCE_ID(IN, OUT);  

                return OUT;
            }

            fixed4 SpriteFrag(v2f IN) : SV_Target
            {
                UNITY_SETUP_INSTANCE_ID(IN);
                // Sample the texture-array layer selected by this quad's instanced index.
                float4 c = UNITY_SAMPLE_TEX2DARRAY(_MainTexArray, float3(IN.texcoord, _SpriteIndex));
                c.rgb *= c.a;
                return c;
            }

            ENDCG
        }
    }
}

Replacing _SpriteIndex with a constant makes the issue go away, so it seems that sampling from the texture array is the root of this issue. Does anyone know what the problem is, or have any leads I can follow?

This is Unity 2018.4.1f1.

What platform does this happen on?

I’ve only tested it on Windows, and it seems to happen across rendering APIs - D3D11, Vulkan, and OpenGL tested with the same issue.

I’ve been slowly narrowing down the problem and I’ve found that, at the exact zoom level where the issue starts occurring, the textures get substantially sharper. That is, at a size of 8.6615 the textures are blurry and the game runs well, and at 8.6614 the textures are sharp and the game runs terribly. So it’s something to do with mipmapping after all…?
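As a quick experiment to confirm the mip theory, I’m going to try biasing the sampler towards lower-resolution mips from script - not a fix, just a test; GetTextureArray() here is our own accessor for the Texture2DArray we build at load time:

// Experiment only: a positive bias makes lower-resolution mips get selected sooner.
// If the "sharp" zoom range starts behaving like the "blurry" one, it's mip selection.
var texArray = GetTextureArray();
texArray.mipMapBias = 1.0f;
texArray.filterMode = FilterMode.Trilinear;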

Try making your texture array way smaller, like 4x4x<num_layers>. This will make sure it hits the cache at least most of the time.

Using half-sized textures for the texture array seemed to work (128x512 instead of 256x1024). Is there a way to solve this that doesn’t reduce texture quality or should we only use the full-sized textures for higher-end GPUs?
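If that ends up being the only option, I guess we could pick the array resolution per machine when we build it - a rough sketch, where the cutoff and sizes are placeholders:

// Rough sketch: build the array at full resolution only when the GPU has plenty of memory.
bool fullRes = SystemInfo.graphicsMemorySize >= 2048; // MB, arbitrary cutoff
int taWidth  = fullRes ? 256 : 128;
int taHeight = fullRes ? 1024 : 512;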

How do you set up the texture array?

public static Texture2DArray GetTextureArray()
{
    if(_texArrayDirty || _texArray == null)
    {
        // _nextId starts at zero and is incremented every time we load a tile sprite, so it is equal to the number of loaded tiles
        _texArray = new Texture2DArray(TA_WIDTH, TA_HEIGHT, _nextId, TextureFormat.ARGB32, true);

        // for each tile id we've loaded from disk
        foreach(var id in _tileIds.Values)
        {
            if(_loadedTiles[id].width != TA_WIDTH || _loadedTiles[id].height != TA_HEIGHT)
            {
                Debug.LogError($"invalid sprite size: {_loadedTiles[id].name}");
                continue; // skip it - SetPixels32 below would throw on a size mismatch
            }

            // add its texture to that layer in the texture array
            var pixels = ((Texture2D)_loadedTiles[id]).GetPixels32();
            _texArray.SetPixels32(pixels, id);
        }

        _texArray.Apply();
        _texArrayDirty = false;
    }

    return _texArray;
}
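And it gets bound to the material roughly like this (simplified - tileMaterial stands for whatever material uses the shader above):

// Hook the generated array up to the shader's _MainTexArray sampler.
tileMaterial.SetTexture("_MainTexArray", GetTextureArray());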

It looks fine…
Can you please submit a bug report?

Hi, did you ever solve this?
I am encountering a similar issue.
I want to draw impostors using instancing: one quad per impostor, accessing a texture inside a texture array.
When zooming out everything is smooth, but when getting really close, sampling the Texture2DArray seems to kill performance in the fragment shader.
I understand that zooming in means more fragments executing the fragment shader, but I thought there would be a cache or something that keeps that fast.
I am running this on an Oculus Quest, by the way; on PC it seems to work fine.

Zooming in also means higher-resolution mip levels of your texture get streamed in, taking up memory, so there could be a bottleneck there.
The other thing to keep in mind is whether you’re rendering with semi-transparency and therefore getting a lot of overdraw when zoomed in, compared to zoomed out where each impostor mostly covers its own unique patch of screen space instead of overlapping with dozens or hundreds of others. There is an “Overdraw” draw mode you can set the Scene view to in order to visualize this.
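If you want to rule the streaming part out quickly, you can disable mip streaming for a test - assuming it is even enabled in your Quality settings; a Texture2DArray built from script may not be streamed at all:

// Quick test: keep every mip level resident instead of streaming them in.
// If the slowdown disappears, the bottleneck was streaming rather than sampling.
QualitySettings.streamingMipmapsActive = false;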

Hi,
Thanks for your answer.
I don’t completely understand the memory bottleneck you are mentioning.
Could you point me to any reference to better understand this problem? Is it related to texture magnification and how the texture is filtered when doing so?
In any case, we are not using semi-transparency, but alpha cutout (discarding fragments based on alpha).