Why must I do that? (float2 uv = i.screenPos.xy / i.screenPos.w)

This is the script of my shader:
Shader "Custom"
{
    Properties
    {
        _MainTex ("Tex", 2D) = "white" {}
    }
    SubShader
    {
        Tags { "RenderType"="Opaque" }
        LOD 100
        Cull Off

        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #include "UnityCG.cginc"

            struct appdata
            {
                float4 vertex : POSITION;
            };

            struct v2f
            {
                float4 vertex : SV_POSITION;
                float4 screenPos : TEXCOORD0;
            };

            sampler2D _MainTex;

            v2f vert (appdata v)
            {
                v2f o;
                o.vertex = UnityObjectToClipPos(v.vertex);
                o.screenPos = ComputeScreenPos(o.vertex);
                return o;
            }

            fixed4 frag (v2f i) : SV_Target
            {
                float2 uv = i.screenPos.xy / i.screenPos.w;
                fixed4 col = tex2D(_MainTex, uv);
                return col;
            }
            ENDCG
        }
    }
    Fallback "Standard"
}

Why must I do that: "i.screenPos.xy / i.screenPos.w"?
And what does "i.screenPos.w" mean?

The divide by w is known as "the perspective divide". The reason it exists has to do with how perspective projection and homogeneous clip space work. That UnityObjectToClipPos() function takes the local object space vertex position and transforms it into world space, and from there into clip space using a combined view (camera relative) and projection (view frustum) matrix. Clip space is essentially a four dimensional screen space position, where x and y are in a -w to +w range, and z is in a -w to +w range or a +w to 0 range (depending on whether or not it's OpenGL). So what is w? For a perspective camera, w is the view space depth.
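If you want to see where that w comes from, here's UnityObjectToClipPos() split into two stages. This is just a mathematically equivalent sketch (the real function does a single combined matrix multiply, and the intermediate variable names are mine, not Unity's):

float4 viewPos = mul(UNITY_MATRIX_MV, v.vertex); // object space -> view space
float4 clipPos = mul(UNITY_MATRIX_P, viewPos);   // view space -> clip space
// In Unity's OpenGL-style convention, the last row of a perspective
// projection matrix is (0, 0, -1, 0), so clipPos.w == -viewPos.z:
// the vertex's depth in front of the camera.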

But, why?

The reason has to do with interpolating values in screen space. When a triangle is rendered on screen, the values the fragment shader receives from the vertex shader are interpolated in screen space from the values at the 3 vertices of that triangle, based on where within the triangle that pixel appears. Normally, for things like texture UVs, you want to account for the perspective, otherwise you get weird warping and stretching like in a PS1 game. Basically, if you don't account for the perspective, the UVs will be interpolated as if the triangles are flat and facing the screen rather than surfaces angled away from the viewer. Homogeneous clip space coordinates allow the GPU to understand that perspective and correct for it in interpolated values.
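As an aside, if you want to see that PS1 style warping for yourself, HLSL has an interpolation modifier that turns off the perspective correction for a single interpolated value. Treat this as an illustrative sketch; the noperspective modifier needs Shader Model 4 / D3D-class targets and isn't supported everywhere:

struct v2f
{
    float4 vertex : SV_POSITION;
    // noperspective makes the GPU interpolate this value linearly in
    // screen space, ignoring depth, which recreates the affine warping.
    noperspective float2 uv : TEXCOORD0;
};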

But when you're trying to do screen space UVs, you don't want that perspective correction. You want the values to be interpolated as if they're facing the screen. To recreate something like the above example (cribbed from Wikipedia) but for screen space projection, it'd look something like this:
[Image: comparison of affine vs. perspective-corrected interpolation for a screen space projected texture]
In this example, the "perspective correction" option in the center is doing the same thing as the "correct" example in the previous image! This is what you'd get if you divide by w in the vertex shader first. But if you pass the homogeneous coordinates from the vertex to the fragment and do the perspective divide in the fragment, you undo the perspective correction the GPU is doing. That's what the ComputeScreenPos() function does: it takes the original clip space position and just rescales & offsets the xy values so they're in a 0.0 to +w range, leaving the w unchanged from the clip space. It's passing the homogeneous screen position coordinates from the vertex to the fragment shader.
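For reference, here's roughly what ComputeScreenPos() looks like inside UnityCG.cginc, simplified to drop the single-pass stereo path (the function name here is mine):

// Simplified ComputeScreenPos(). _ProjectionParams.x is +1 or -1,
// handling platforms where clip space y is flipped.
float4 ComputeScreenPosSimplified(float4 clipPos)
{
    float4 o = clipPos * 0.5;
    // Remap xy from the -w to +w range to the 0 to +w range.
    o.xy = float2(o.x, o.y * _ProjectionParams.x) + o.w;
    // Keep the original zw so the fragment shader can divide by w.
    o.zw = clipPos.zw;
    return o;
}

The fragment shader then finishes the job: i.screenPos.xy / i.screenPos.w remaps that 0 to +w range to 0.0 to 1.0 UVs across the screen.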

TLDR: It prevents weird warping caused by the GPU trying to do perspective correction on coordinates you don’t want perspective corrected.
