Something like the code below might end up faster than the if statements, especially on some older hardware.

```
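// Sample the data texture.
fixed4 data = tex2D(_DataTex, UV.xy);
// Channel index: 0, 1, 2, or 3.
half channel = floor(fmod(UV2.x, 4.0));
// Build a one-hot mask from the index and use it to pick a single channel.
fixed output = dot(data, saturate(-abs(channel - half4(0.0, 1.0, 2.0, 3.0)) + 1.0));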
```

A little explanation:

The first line is self-explanatory: it gets the texture data.

The second line gets us a value of 0, 1, 2, or 3 by flooring the fmod.
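To make that concrete, here is a rough sketch of the second line with a few worked values in the comments. `ChannelIndex` is a hypothetical helper name, not something from the original shader, and the inputs are made-up examples. It assumes `UV2.x` is never negative: `fmod` keeps the sign of its first argument, so a negative input would give a negative index.

```
// Hypothetical helper wrapping the second line of the snippet above.
half ChannelIndex(half u)
{
    // For non-negative u, fmod wraps the value into the 0 to 4 range
    // (4 excluded) and floor drops the fraction:
    //   u = 2.0 -> fmod 2.0 -> floor 2.0
    //   u = 5.5 -> fmod 1.5 -> floor 1.0
    //   u = 7.0 -> fmod 3.0 -> floor 3.0
    //   u = 8.0 -> fmod 0.0 -> floor 0.0
    return floor(fmod(u, 4.0));
}
```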

The third line is the magic.

A dot product is an easy way to add a bunch of values together, as it’s highly optimized on GPUs.

`foo.x + foo.y + foo.z + foo.w`

is slower than

`dot(foo, float4(1.0,1.0,1.0,1.0))`

A dot product on modern hardware can be done in a single cycle, whereas the three adds take a single cycle each. Even on older hardware the dot product is probably going to be two cycles and not three.

So now the `saturate(-abs(channel - half4(0.0, 1.0, 2.0, 3.0)) + 1.0)` part. This can probably best be explained by a Wolfram Alpha link.

http://www.wolframalpha.com/input/?i=-abs(x+-+(0.0,+1.0,+2.0,+3.0))+++1.0+with+x+=+0+to+4

Basically it’s taking the channel value and mapping each of the four components into a 0 to 1 range (plus some negatives, which the `saturate` clamps to zero). Because channel is floored, the values you get back are actually only ever zero or one, but adding floor to the Wolfram Alpha link makes it more difficult to understand. So a channel value of 0 results in a half4(1.0, 0.0, 0.0, 0.0), a channel value of 1 results in a half4(0.0, 1.0, 0.0, 0.0), etc. The dot product of the texture data with that mask then returns just the matching channel.
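The `saturate(-abs(...) + 1.0)` expression can also be unrolled into separate steps to see where the mask comes from. This is only an illustrative sketch: `SelectChannel` and the intermediate variable names are made up here, and the comments trace a `channel` value of 2.

```
// Hypothetical helper: the third line, split into its individual steps.
fixed SelectChannel(fixed4 data, half channel)
{
    half4 diff = channel - half4(0.0, 1.0, 2.0, 3.0); // ( 2,  1,  0, -1)
    half4 dist = abs(diff);                           // ( 2,  1,  0,  1)
    half4 tent = -dist + 1.0;                         // (-1,  0,  1,  0)
    half4 mask = saturate(tent);                      // ( 0,  0,  1,  0)
    // Only the component matching the index survives, here data.z.
    return dot(data, mask);
}
```

A compiler should be able to fold these steps back into the same instructions as the one-liner, so splitting them out like this is purely for readability.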

The `saturate` and `abs` are both “free” (on most GPUs they are modifiers applied as part of another instruction), so that entire line is just 3 cycles even with the dot product. The second line of just `floor(fmod(UV2.x, 4.0))` might be slower!