ShaderModel 3.0 Texture access inside dynamic conditional 'if' blocks

Hey!

I was just working on a special specular shader and came across a problem when adding the #pragma target 3.0 statement.

When compiling the shader without #pragma target 3.0, no errors occur. But when adding this option I get the following error message:

The following example shows the same problem with the help of the first Surface Shader Lighting Model Example.

 Shader "Example/Diffuse Texture" {

    Properties {
      _MainTex ("Texture", 2D) = "white" {}
      _Ramp ("Texture", 2D) = "white" {}
    }

    SubShader {
      Tags { "RenderType" = "Opaque" }

      CGPROGRAM
      #pragma surface surf SimpleLambert
      #pragma target 3.0

      sampler2D _Ramp;

      half4 LightingSimpleLambert (SurfaceOutput s, half3 lightDir, half atten) {
          half NdotL = dot (s.Normal, lightDir);

          half4 c;
          c.rgb = s.Albedo * _LightColor0.rgb * (NdotL * atten * 2);
          c.a = s.Alpha;
        
          // Texture read inside a dynamic branch: this is what triggers the error.
          if (NdotL > 0.5)
              c.rgb += tex2D(_Ramp, half2(NdotL, 1)).rgb;

          return c;
      }


      struct Input {
          float2 uv_MainTex;
      };

      sampler2D _MainTex;

      void surf (Input IN, inout SurfaceOutput o) {
          o.Albedo = tex2D (_MainTex, IN.uv_MainTex).rgb;
      }

      ENDCG
    }
    Fallback "Diffuse"
  }

As far as I know this problem didn’t occur in Unity 2.6, but I’m not exactly sure. Any ideas?

Thanks,
Jan

PS: Shader Model 3.0 is not required in this example, I know. :wink:

Actually the same problem was occurring in 2.6 as well, just the shader loading error was not printed to the Unity console. So most likely the shader was not actually working, but there was no clear indication about that.

Now, about the actual problem: you can’t do regular texture accesses inside a “dynamic branch”. Dynamic branch is when the condition might change between different pixels. This is not Unity specific; it’s just the way GPUs work (technical reason: since neighboring pixels can result in different code paths, the GPU has no way to determine which mipmap levels to use).
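To make the mipmap point concrete, here is a rough sketch of what a plain tex2D call does implicitly. The helper name `SampleRampExplicit` is made up for illustration, and the equivalence is only approximate; the point is that the derivatives compare values across neighboring pixels, which is not well defined if those neighbors took a different code path.

```hlsl
// Hypothetical helper, for illustration only: spells out the
// screen-space derivatives that tex2D normally computes for you.
half3 SampleRampExplicit (sampler2D ramp, half NdotL) {
    float2 uv = float2 (NdotL, 1);
    // How much uv changes between horizontally and vertically
    // neighboring pixels; the GPU derives the mip level from this.
    float2 dx = ddx (uv);
    float2 dy = ddy (uv);
    // Roughly equivalent to tex2D (ramp, uv).
    return tex2Dgrad (ramp, uv, dx, dy).rgb;
}
```

Whether tex2Dgrad compiles on a given target (for example the OpenGL path) is not something this thread confirms, so treat it as a sketch rather than a tested recipe.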

So the workarounds could be several:

  1. Do not read the texture inside a branch. Instead, replace the code with something like (typing off the top of my head):
half3 ramp = tex2D (_Ramp, half2(NdotL,1)).rgb;
half factor = NdotL > 0.5 ? 1.0 : 0.0;
c.rgb += ramp * factor;

This in almost all cases will be faster than doing a real dynamic branch as well. Sometimes much faster.

  2. Do not use tex2D, but supply your own mip level instead, via tex2Dlod or similar. This might not work when compiling the shader for OpenGL, unless you use “#pragma glsl”.
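Applied to the lighting function from the example shader, the tex2Dlod workaround might look like the sketch below. It assumes that always reading mip level 0 of `_Ramp` is acceptable, which is the case for a lookup ramp but not necessarily for a regular texture.

```hlsl
sampler2D _Ramp;

half4 LightingSimpleLambert (SurfaceOutput s, half3 lightDir, half atten) {
    half NdotL = dot (s.Normal, lightDir);

    half4 c;
    c.rgb = s.Albedo * _LightColor0.rgb * (NdotL * atten * 2);
    c.a = s.Alpha;

    if (NdotL > 0.5) {
        // The mip level is passed explicitly in .w (0 = full resolution),
        // so no screen-space derivatives are needed inside the branch.
        c.rgb += tex2Dlod (_Ramp, float4 (NdotL, 1, 0, 0)).rgb;
    }

    return c;
}
```

The rest of the shader stays as it is, including `#pragma target 3.0`.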

Oh, why don’t I get notifications by email anymore? Anyway, thanks for your reply!

That’s really strange, because the result looked ok in 2.6 and the shader was definitely used. Is it possible that the tex2D instruction is somehow replaced with a constant then?
Right now (using Unity 3.1) the shader is just black when the error occurs.

For your workarounds:

  1. The thing is that I perform quite a lot of work inside the dynamic branch, so I am not sure if it’s really faster to remove the branch.

  2. Using the tex2Dlod(_Ramp, half4(uv, 0,0)) seems to work fine… it contains a precalculated math function, so I don’t need mip levels anyway. Thanks, really helped me a lot!

The code in the branch will always execute in shaders. The if only decides which branch’s result is used at the end.
For that reason, ifs don’t save you performance (the opposite, actually); they only let you get differing effects depending on a value with far less complex shaders.

This is not true.

On a true dynamic branching shader model and GPU (i.e. shader model 3.0), dynamic branches are actually skipped if a whole “block of pixels” (the size depends on the GPU) takes the same code path. E.g. imagine a 32x32 region of pixels on screen: if all of them go the same path, then the branches are properly skipped. If any of them take different paths, that whole region starts executing both sides of the branches and masks out the calculations at the end.

That “32x32 block” was just an example; the actual block size varies between different GPUs. But in general, dynamic branching is best used for things that usually take the same code path over large portions of screen.

You can read a texture inside a branch, but only with direct texture coordinates, i.e. tex2D(_SomeTexture, IN.SomeUV);

It’s possible that there is such an optimization for larger areas; I don’t know about that.

But I know from CUDA programming that an if in GPU programming is in no way comparable to an if in traditional programming, and that careless misuse of branching costs more performance than it saves. Stream processors are not really optimizable for branching: they are meant to push data streams through (hence the name) and are far less suited to work that requires a data lookup for evaluation, since waiting on a GPU can, at worst, stall whole areas of the GPU (depending, in CUDA for example, on how many threads are running).

There are cases where using an if can be a good idea, but branching around a single line definitely costs more than it helps, as a single vector op is a single-tick instruction.
Using mathematical branches, i.e. value-dependent factors that zero out terms in an equation as you’ve shown in your example, will always be faster in such situations.
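As a sketch of such a "mathematical branch", the factor from the earlier workaround can also be written with the built-in step function. The helper name `AddRampBranchless` is hypothetical, and note one small difference: step also returns 1 when NdotL is exactly 0.5, whereas the original condition was NdotL > 0.5.

```hlsl
// Hypothetical helper: adds the ramp contribution without an if.
half3 AddRampBranchless (sampler2D ramp, half3 color, half NdotL) {
    // The texture is always read, outside of any branch.
    half3 rampColor = tex2D (ramp, half2 (NdotL, 1)).rgb;
    // step (edge, x) returns 1 when x >= edge, otherwise 0.
    half factor = step (0.5, NdotL);
    // A factor of 0 cancels the ramp term instead of skipping it.
    return color + rampColor * factor;
}
```

In the example lighting function it would be called as `c.rgb = AddRampBranchless (_Ramp, c.rgb, NdotL);`.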

For anybody else having error X6077, it can be fixed by setting the LOD level explicitly: replace the tex2D instruction giving the error (this has been mentioned in passing above, but here is sample code)

tex2D (A, B)

with the explicit

tex2Dlod (A, float4 (B, 0, 0))

The error will disappear.


hi there,

i have just written a shader using #pragma target 3.0 and heavy texture lookups (simply using: tex2D) in conditional “if” blocks, and it worked quite well under os x.
it even works on windows, at least in the webplayer: http://bit.ly/OZ0yRp
in order to make sure that the shader works correctly, compare the webplayer with the following picture. the result in the webplayer should look like the image on the right.

although it seems to run under both mac and win, it can’t be compiled under windows… which i guess is pretty strange.
i rewrote the shader to make all texture lookups outside the if branches, but now it looks up textures even if it does not need them, which will cost a lot of time i guess.

any idea anybody?

lars

Aras, old post that rocks! :slight_smile:

When I upgraded my project from Unity5.5.2 to Unity5.6.5p2 I got the above error, and your solution fixed it. I don’t know shader programming, so I don’t know why. Thanks.

Great, I’m using Unity5.6 too.
It seems to be fixed as below:

float ClipTex = tex2Dlod(_DissolveSrc, float4(IN.worldPos.x,IN.worldPos.y,0,0)/_Tile).r * tex2Dlod(_DissolveSrc, float4(IN.worldPos.y,IN.worldPos.z,0,0)/_Tile).g;