Optimize this shader for mobile?

I’m using this simple hue and saturation shift shader and I’d like to make it as optimal for mobile as possible (Android and iOS) - any hints?

The shader:

Shader "Custom/Room/EnvironmentUnlitHSV"
{
    Properties
    {
        _MainTex ("Base (RGB) Trans. (Alpha)", 2D) = "white" { }
    }

    SubShader 
    {
     Pass 
     {
       Fog { Mode off }
       
CGPROGRAM
        #pragma vertex vert_img
        #pragma fragment frag
        #pragma fragmentoption ARB_precision_hint_fastest 
        #include "UnityCG.cginc"           

        uniform sampler2D _MainTex;

        //
        // hue shift 0..360
        //
        float _RoomHueShift = 0;

        //
        // saturation -1..1
        //
        float _RoomSatShift = 0;
       
        float3 hsv_shift(float3 RGB, float3 shift)
        {
           float3 RESULT = float3(RGB);
           float shiftX314_d_180 = shift.x*3.14159265/180;
            float VSU = shift.z*shift.y*cos(shiftX314_d_180);
            float VSW = shift.z*shift.y*sin(shiftX314_d_180);

            float shiftZ587 = .587*shift.z;
            float shiftZ114 = .114*shift.z;
            float shiftZ299 = .299*shift.z;

            RESULT.x = (shiftZ299+.701*VSU+.168*VSW)*RGB.x
                      + (shiftZ587-.587*VSU+.330*VSW)*RGB.y
                      + (shiftZ114-.114*VSU-.497*VSW)*RGB.z;
           
            RESULT.y = (shiftZ299-.299*VSU-.328*VSW)*RGB.x
                      + (shiftZ587+.413*VSU+.035*VSW)*RGB.y
                      + (shiftZ114-.114*VSU+.292*VSW)*RGB.z;
           
            RESULT.z = (shiftZ299-.3*VSU+1.25*VSW)*RGB.x
                      + (shiftZ587-.588*VSU-1.05*VSW)*RGB.y
                      + (shiftZ114+.886*VSU-.203*VSW)*RGB.z;
           
            return (RESULT);
        }

        float4 frag (v2f_img i) : COLOR
        {   
         float4 tex = tex2D(_MainTex, i.uv);
         
         float3 hsv;

          hsv.x  = _RoomHueShift;         
         hsv.y  = 1 + _RoomSatShift;
         hsv.z  = 1;

         tex.rgb = hsv_shift(tex.rgb, hsv);
         
         return tex;
        } 
ENDCG
     }
    }
}

The simple optimisation I see is using the correct type size in the fragment function. It currently uses floats everywhere, and I’m pretty sure I can get away with half or fixed here. I’ve been testing different types, but I’m somewhat confused, because:

If I change the frag() and hsv_shift() functions to use fixed values like so:

       fixed3 hsv_shift(fixed3 RGB, fixed3 shift)
        {
                   fixed3 RESULT = fixed3(RGB);
                   fixed shiftX314_d_180 = shift.x*3.14159265/180;
                    fixed VSU = shift.z*shift.y*cos(shiftX314_d_180);
                    fixed VSW = shift.z*shift.y*sin(shiftX314_d_180);

                    fixed shiftZ587 = .587*shift.z;
                    fixed shiftZ114 = .114*shift.z;
                    fixed shiftZ299 = .299*shift.z;

                    RESULT.x = (shiftZ299+.701*VSU+.168*VSW)*RGB.x
                        + (shiftZ587-.587*VSU+.330*VSW)*RGB.y
                        + (shiftZ114-.114*VSU-.497*VSW)*RGB.z;
           
                    RESULT.y = (shiftZ299-.299*VSU-.328*VSW)*RGB.x
                        + (shiftZ587+.413*VSU+.035*VSW)*RGB.y
                        + (shiftZ114-.114*VSU+.292*VSW)*RGB.z;
           
                    RESULT.z = (shiftZ299-.3*VSU+1.25*VSW)*RGB.x
                        + (shiftZ587-.588*VSU-1.05*VSW)*RGB.y
                        + (shiftZ114+.886*VSU-.203*VSW)*RGB.z;
           
                    return (RESULT);
        }

        fixed4 frag (v2f_img i) : COLOR
        {   
         fixed4 tex = tex2D(_MainTex, i.uv);
         fixed4 hsv;

          hsv.x  = _RoomHueShift;
         hsv.y  = 1 + _RoomSatShift;
         hsv.z  = 1;
         tex.rgb = hsv_shift(tex.rgb, hsv);
         
         return tex;
        }

This should fail for the hue shift, since it’s specified in 0…360 (outside of the -2…2 range for a fixed) - what is going on here?

Also, what happens with the RHS arithmetic: what types are the operands during computation? I can’t seem to find a way to specify fixed literals - do I need to litter casts everywhere through the arithmetic there?

Why is this even working when the hue is outside of the type range?

I understand that it’s expensive to convert between types in a fragment func - but it’s not very obvious when this is happening…

Now, the type sizes aside, what other ways could this be optimised?

It would be awesome to be able to keep the result from the previous frame for the next frame - this shader is being applied to some static textures, ie, they only need to be recalculated when the hue or saturation is changed (very infrequently) - can this be saved and reused?

Lookup tables? Is this a feasible thing to do here? I’ve seen some mention of using a texture3d lookup; has anyone had any experience doing this in Unity?

Open to all ideas…

To make it function only when the value changes, you could just define:

float _preRoomHueShift = 1; //we want these to start as a diff value so the hue updates once first
float _preRoomSatShift = 1;
fixed4 _preColor;

And then in your Frag:

if (_RoomHueShift == _preRoomHueShift && _RoomSatShift == _preRoomSatShift)
{
    return _preColor;
}
else
{
    _preRoomHueShift = _RoomHueShift;
    _preRoomSatShift = _RoomSatShift;
    fixed4 tex = tex2D(_MainTex, i.uv);
    fixed4 hsv;

    hsv.x = _RoomHueShift;
    hsv.y = 1 + _RoomSatShift;
    hsv.z = 1;
    tex.rgb = hsv_shift(tex.rgb, hsv);
    _preColor = tex;
    return tex;
}

I think this should work, I can’t remember if HLSL allows the && conditional. But give it a shot!

Hey Invertex,

Can that really work? The color will be different for every call to frag (for each pixel), but you’re only storing the value of the very first pixel (?). Also, does the value of _preColor persist between frames?

I gave it a shot regardless and it is always black (like _preColor is always float4(0,0,0,1)).

Yeah you’re right. I’m still somewhat new to shader programming.

I’m not sure how you’d store the whole final texture result to use next frame; it’d be interesting to learn though.
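One way to keep the result between frames, sketched under assumptions: render the shifted texture once into a RenderTexture with Graphics.Blit and display that with a plain unlit texture shader, re-baking only when hue or saturation changes. BakedHueShift, Rebake and the field names are hypothetical, and the sketch assumes shiftMaterial uses the hue/saturation shader from this thread while target uses some ordinary unlit shader:

```csharp
using UnityEngine;

// Hypothetical component (not from the thread): bakes the shifted texture once
// and re-bakes only on request, so the per-frame shader can be a plain unlit one.
public class BakedHueShift : MonoBehaviour
{
    public Texture2D source;       // original, unshifted texture
    public Material shiftMaterial; // material using the hue/saturation shader
    public Renderer target;        // renderer that displays the baked result

    RenderTexture baked;

    // Call this after the hue or saturation values have changed.
    public void Rebake()
    {
        if (baked == null)
        {
            // Depth buffer size 0: only colour output is needed.
            baked = new RenderTexture(source.width, source.height, 0);
            target.material.mainTexture = baked;
        }

        // Runs the shader once over the whole texture and stores the output.
        Graphics.Blit(source, baked, shiftMaterial);
    }

    void OnDestroy()
    {
        if (baked != null) baked.Release();
    }
}
```

This trades a one-off blit and extra texture memory for a cheaper per-frame shader. Render texture contents can be lost in some situations (Unity documents this for events such as switching in and out of fullscreen), so it is probably wise to check baked.IsCreated() and call Rebake() again when needed.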

Precision specifiers have no effect on desktop. Desktop GPUs are powerful enough that they don’t support lower-precision data types like that at all. So, if you want to make sure that your precision constraints aren’t too tight, you need to test on the target platform. Even then, OpenGL ES only specifies the minimum precision for those specifiers; some mobile GPUs might treat fixed as half, or use something in between.
Literals and constants are cast to the appropriate precision when the shader is compiled, so you don’t need to worry about that.

As for optimizations, you should try to move all calculations that are the same for all frag shader invocations to the CPU side, using a script or a custom material editor. From looking at your code, there’s a lot of it, since the shift part is constant. All calculations that truly need to be done every pixel basically boil down to a single matrix multiply, or three dot products, like so:

fixed3 redCompShift = fixed3(
    shiftZ299+.701*VSU+.168*VSW,
    shiftZ587-.587*VSU+.330*VSW,
    shiftZ114-.114*VSU-.497*VSW
);

RESULT.x = dot(RGB, redCompShift);

where redCompShift can be precomputed in a script and passed down to the shader as a parameter. Dot products are quite fast, so that should give you one speedy shader.
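As a sanity check on this reduction, here is a minimal sketch in plain C# (no Unity types, so it can run anywhere). HueSatRows, Compute and Apply are hypothetical names, not from the thread; the constants are copied from the original hsv_shift with the value component fixed at 1. With hue 0 and saturation 0 the three rows should come out close to the identity matrix, so a color passes through almost unchanged:

```csharp
using System;

// Hypothetical helper (not from the thread): builds the three rows once on the
// CPU, then applies them to a color with three dot products.
public static class HueSatRows
{
    // hueDegrees in 0..360 and saturation in -1..1, as in the thread.
    public static float[][] Compute(float hueDegrees, float saturation)
    {
        double angle = hueDegrees * Math.PI / 180.0;
        float vsu = (float)((1 + saturation) * Math.Cos(angle));
        float vsw = (float)((1 + saturation) * Math.Sin(angle));

        return new[]
        {
            new[] { .299f + .701f * vsu + .168f * vsw, .587f - .587f * vsu + .330f * vsw, .114f - .114f * vsu - .497f * vsw },
            new[] { .299f - .299f * vsu - .328f * vsw, .587f + .413f * vsu + .035f * vsw, .114f - .114f * vsu + .292f * vsw },
            new[] { .299f - .300f * vsu + 1.25f * vsw, .587f - .588f * vsu - 1.05f * vsw, .114f + .886f * vsu - .203f * vsw },
        };
    }

    public static float Dot(float[] a, float[] b) => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];

    // The only per-pixel work left: one dot product per output channel.
    public static float[] Apply(float[][] rows, float[] rgb) =>
        new[] { Dot(rows[0], rgb), Dot(rows[1], rgb), Dot(rows[2], rgb) };
}
```

The blue row is only approximately (0, 0, 1) at zero shift because its rounded constants (.300 and .588) differ slightly from .299 and .587, so expect a difference of about 0.001 there.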


Hey @Dolkar ,

Thanks for the detailed reply!

That explains why things were still working regarding the precision specifiers…

A huge thank you for your suggested optimisation. I’ve implemented my ‘interpretation’ of what you suggested; is this what you meant?

In my C# code I do:

public void set(AdjustmentValues a)
{
    activeSettings = a;

    float VSU = (1 + a.saturation) * Mathf.Cos(a.hue * 3.14159265f / 180);
    float VSW = (1 + a.saturation) * Mathf.Sin(a.hue * 3.14159265f / 180);

    Vector4 redCompShift = new Vector4(.299f + .701f * VSU + .168f * VSW, .587f - .587f * VSU + .330f * VSW, .114f - .114f * VSU - .497f * VSW, 0);
    Vector4 greenCompShift = new Vector4(.299f - .299f * VSU - .328f * VSW, .587f + .413f * VSU + .035f * VSW, .114f - .114f * VSU + .292f * VSW, 0);
    Vector4 blueCompShift = new Vector4(.299f - .300f * VSU + 1.25f * VSW, .587f - .588f * VSU - 1.05f * VSW, .114f + .886f * VSU - .203f * VSW, 0);

    Shader.SetGlobalVector("_RoomRedCompShift", redCompShift);
    Shader.SetGlobalVector("_RoomGreenCompShift", greenCompShift);
    Shader.SetGlobalVector("_RoomBlueCompShift", blueCompShift);
}

Then the shader becomes:

Shader "Custom/Room/RoomUnlitHSV"
{
    Properties
    {
        _MainTex ("Base (RGB) Trans. (Alpha)", 2D) = "white" { }
    }

    SubShader
    {
     Pass
     {
CGPROGRAM
        #pragma vertex vert_img
        #pragma fragment frag
        #pragma fragmentoption ARB_precision_hint_fastest
        #include "UnityCG.cginc"    

        uniform sampler2D _MainTex;  
      
        fixed4 _RoomRedCompShift   = fixed4(1,0,0,0);
        fixed4 _RoomGreenCompShift = fixed4(0,1,0,0);
        fixed4 _RoomBlueCompShift  = fixed4(0,0,1,0);

        fixed3 hsv_shift(fixed3 RGB)
        {
            fixed3 shifted;
            shifted.x = dot(RGB, _RoomRedCompShift.rgb);
            shifted.y = dot(RGB, _RoomGreenCompShift.rgb);
            shifted.z = dot(RGB, _RoomBlueCompShift.rgb);
            return shifted;
        }

        fixed4 frag (v2f_img i) : COLOR
        {                            
           fixed4 tex = tex2D(_MainTex, i.uv);  
           tex.rgb = hsv_shift(tex.rgb);
            return tex;          
        }   
ENDCG
    }
  }
}

I’ve used fixed4s for the pre-computed shift values since I could not find a Shader.SetGlobalVector3 equivalent (is there some other trick?)

Am I paying a price for calling .rgb on each of those vectors? (I assume that must convert it to a fixed3?)

So far this works perfectly - so thanks again!

Also, for some reason, the default values:

fixed4 _RoomRedCompShift   = fixed4(1,0,0,0);
fixed4 _RoomGreenCompShift = fixed4(0,1,0,0);
fixed4 _RoomBlueCompShift  = fixed4(0,0,1,0);

Don’t seem to have any effect… I’d like these values to be used if Shader.SetGlobalVector() is never called.

There are no overloads for other vector types because they are not necessary. You can pass a Vector3 to SetGlobalVector() and it’ll automatically be converted to a Vector4. In the shader, the type does not have to match exactly: a fixed3 will take only the first three components of the vector, and an int2, for example, would take only the first two components and cast them to integers. This is done on the Unity / driver side and has no extra cost. Swizzling (.rgb) is also, in most cases, free in shaders.
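One thing that may be worth checking if the rows are declared as fixed4: this thread quotes a range of -2…2 for fixed, and the precomputed entries can leave that range when saturation is raised. The sketch below is a hypothetical offline check in plain C# (FixedRangeCheck and LargestEntry are made-up names). It relies on each entry having the form a + b*VSU + c*VSW, so over all hues its magnitude peaks at |a| + s*sqrt(b*b + c*c) with s = 1 + saturation:

```csharp
using System;

// Hypothetical offline check (not from the thread): finds the largest magnitude
// any of the nine precomputed matrix entries can reach for a given saturation.
public static class FixedRangeCheck
{
    // (a, b, c) triples copied from the thread's red, green and blue rows,
    // where each entry is a + b*VSU + c*VSW.
    static readonly double[,] Coeffs =
    {
        { .299,  .701,  .168 }, { .587, -.587,  .330 }, { .114, -.114, -.497 },
        { .299, -.299, -.328 }, { .587,  .413,  .035 }, { .114, -.114,  .292 },
        { .299, -.300, 1.25  }, { .587, -.588, -1.05 }, { .114,  .886, -.203 },
    };

    // saturation is the thread's -1..1 shift; the hue is taken at its worst case.
    public static double LargestEntry(double saturation)
    {
        double s = 1 + saturation;
        double worst = 0;
        for (int i = 0; i < Coeffs.GetLength(0); i++)
        {
            double a = Coeffs[i, 0], b = Coeffs[i, 1], c = Coeffs[i, 2];
            worst = Math.Max(worst, Math.Abs(a) + s * Math.Sqrt(b * b + c * c));
        }
        return worst;
    }
}
```

With the thread's constants the largest entry stays below 2 at saturation 0 but exceeds 2 at saturation 1. On hardware that really implements fixed with the minimum range, declaring the rows as half may therefore be the safer choice; treat that as an assumption to verify on the target devices.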
Unity seems to ignore default values defined in that way… that’s what the Properties block is for, but I’m not sure if it works for global shader values. The safest bet is to handle this in your script, if possible.

Cheers…

Yes I just found a thread explaining that I’d need to explicitly set the global shader params - no issue!
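For reference, a minimal sketch of setting those defaults from script, assuming a hypothetical RoomShiftDefaults component that lives in the first scene loaded. It sets identity rows, so the texture is drawn unchanged until set() supplies real values:

```csharp
using UnityEngine;

// Hypothetical bootstrap component (not from the thread): gives the global
// shader vectors sensible values before set() has ever been called.
public class RoomShiftDefaults : MonoBehaviour
{
    void Awake()
    {
        // Identity matrix rows: output color equals input color.
        Shader.SetGlobalVector("_RoomRedCompShift",   new Vector4(1, 0, 0, 0));
        Shader.SetGlobalVector("_RoomGreenCompShift", new Vector4(0, 1, 0, 0));
        Shader.SetGlobalVector("_RoomBlueCompShift",  new Vector4(0, 0, 1, 0));
    }
}
```

The property names must match the uniforms declared in the shader exactly; a mismatch is not reported as an error, the value is simply never picked up.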