URP - Understanding _ScaledScreenParams and homogeneous clip space position

Hello,

I have a shader in URP that works, but I do not understand why it works. I compute the homogeneous clip space position using the TransformObjectToHClip function. Then I want to sample a render texture that was created by another shader. This texture stores object IDs as colors, so it is a full-screen texture. I am using code from this tutorial: Reconstruct the world space positions of pixels from the depth texture | Universal RP | 13.1.9

float2 uv = IN.positionCS.xy / _ScaledScreenParams.xy;

This works and samples the texture correctly. But why? I thought clip space xy ranges from -w to w. From the tutorial: "The property _ScaledScreenParams.xy takes into account any scaling of the render target, such as Dynamic Resolution", so its xy should be the render target resolution. So how do I end up with UVs in the 0 to 1 range? I need to understand this before working further; it does not matter that it is working (although I am happy it does :-)).

Thanks for answer.


The one small thing you’re missing is the fact SV_POSITION is special.

For all values passed between the vertex and the fragment shader, the values are not directly modified apart from the perspective correct barycentric interpolation that produces the value in the fragment shader. All, that is, except for SV_POSITION, which behaves completely differently.

As you accurately understand, the value calculated in the vertex shader is the homogeneous clip space position of that vertex. That value is used to calculate the pixel screen space position of each vertex for rasterization by dividing the xyz by w and rescaling the resulting -1 to 1 range of xy to the range 0.0 through the target resolution. And that’s the value that is then interpolated and passed to the fragment shader.

So, in short, the SV_POSITION the fragment shader receives is:
xy = the pixel position of that fragment
z = the non-linear z depth
w = the w set in the vertex shader, which happens to be the camera space depth for perspective rendering and 1.0 for orthographic rendering

So dividing by _ScaledScreenParams.xy, which is the current resolution (possibly scaled by dynamic resolution which can use only a portion of a frame buffer) gets you a 0.0 to 1.0 range useful for UVs.
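
To make that concrete, here is a minimal sketch of the setup from the original question (a sketch only, assuming the shader includes URP's Core.hlsl; _ObjectIdTexture is a hypothetical name standing in for whatever full-screen render texture is being sampled):

#include "Packages/com.unity.render-pipelines.universal/ShaderLibrary/Core.hlsl"

TEXTURE2D(_ObjectIdTexture);
SAMPLER(sampler_ObjectIdTexture);

struct Varyings
{
    float4 positionCS : SV_POSITION;
};

Varyings vert(float4 positionOS : POSITION)
{
    Varyings OUT;
    // Homogeneous clip space position; xy is still in the -w to +w range here.
    OUT.positionCS = TransformObjectToHClip(positionOS.xyz);
    return OUT;
}

half4 frag(Varyings IN) : SV_Target
{
    // By the time the fragment shader runs, SV_POSITION.xy has been rewritten to
    // the pixel coordinate, so dividing by the (possibly dynamically scaled)
    // resolution gives UVs in the 0 to 1 range.
    float2 screenUV = IN.positionCS.xy / _ScaledScreenParams.xy;
    return SAMPLE_TEXTURE2D(_ObjectIdTexture, sampler_ObjectIdTexture, screenUV);
}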


Thank you so much!
I still have that stupid C# mindset and did not consider that the pipeline executes steps between the vertex and fragment shaders and can modify the data.

Thank you Bgolus. Do you mean only the value with the SV_POSITION semantic gets modified by perspective correct barycentric interpolation? (Update: no. I misunderstood the whole thing…)
If I add a positionCS2 to the struct Varyings with a TEXCOORDn semantic, can I then calculate the screenUV in the fragment shader with
【float2 screenUV = (i.positionCS2.xy/i.positionCS2.w)*0.5+0.5;】?
Why doesn’t it work? (Update: it works, but it is flipped upside down.)
Meanwhile I can pass GetVertexPositionInputs().positionNDC from the vertex shader to the fragment shader and calculate the screenUV in the fragment shader with
【float2 screenUV = i.positionNDC.xy / i.positionNDC.w;】.
Why does this work while the previous positionCS2 version doesn’t?

Update:
Thanks for Bgolus’s reply. Here is my correction.
If I add a positionCS2 to the struct Varyings with a TEXCOORDn semantic, I can calculate the screenUV in the fragment shader with
【float2 screenUV = (i.positionCS2.xy/i.positionCS2.w)*0.5+0.5;】
but it is flipped upside down, so it should really be:

i.positionCS2.y *= _ProjectionParams.x; // positionCS2 with TEXCOORDn semantic
float2 screenUV = (i.positionCS2.xy/i.positionCS2.w)*0.5+0.5;

Meanwhile, the single line 【float2 screenUV = i.positionNDC.xy / i.positionNDC.w;】 works because the y component of positionNDC has already been flipped inside the GetVertexPositionInputs() method in the vertex shader. You can see the method here: <link>
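
Putting the two options side by side, a minimal sketch (same URP include as above; struct and variable names are just illustrative):

struct Attributes
{
    float4 positionOS : POSITION;
};

struct Varyings
{
    float4 positionCS  : SV_POSITION; // rewritten by the rasterizer before the fragment stage
    float4 positionCS2 : TEXCOORD0;   // the same clip space value, passed through untouched
    float4 positionNDC : TEXCOORD1;   // from GetVertexPositionInputs(), y flip already applied
};

Varyings vert(Attributes IN)
{
    Varyings OUT;
    VertexPositionInputs posInputs = GetVertexPositionInputs(IN.positionOS.xyz);
    OUT.positionCS  = posInputs.positionCS;
    OUT.positionCS2 = posInputs.positionCS;
    OUT.positionNDC = posInputs.positionNDC;
    return OUT;
}

half4 frag(Varyings IN) : SV_Target
{
    // Option A: raw clip space passed via TEXCOORD, flip y manually where needed.
    float4 cs = IN.positionCS2;
    cs.y *= _ProjectionParams.x;
    float2 screenUV_A = (cs.xy / cs.w) * 0.5 + 0.5;

    // Option B: positionNDC already has the y flip and the *0.5+0.5 remap baked in.
    float2 screenUV_B = IN.positionNDC.xy / IN.positionNDC.w;

    // The two should match; visualizing the difference should show solid black.
    return half4(abs(screenUV_A - screenUV_B), 0, 1);
}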

No, all interpolated values get perspective correct barycentric interpolation by default, except SV_POSITION.

As an example of what I mean by perspective correct barycentric interpolation, here are quad meshes (two triangles) with the same UVs and texture being displayed in three slightly different ways.
[Image: the same two-triangle quad, with identical UVs and texture, rendered three different ways]
On the left is the quad rendered facing the camera normally. On the right it is rotated 45 degrees away from the camera and scaled so it’s the same screen space height as the first.

In the middle is… well, it could be the two triangle quad with the top two vertices moved closer together, or it could be the rotated and scaled quad with perspective correct barycentric interpolation disabled. They look exactly the same so it’s actually impossible to know from that image which one it is!

In this case it happens to be the same rotated and scaled quad as on the right, but with perspective correction disabled. But my main point is that the per vertex UV data in all 3 of these examples is exactly the same and is unmodified by the GPU. The bottom left is (0,0), the top right is (1,1), etc. The only difference between these is how the data is interpolated across the triangle. It’s always barycentric interpolation, i.e. 3-point interpolation, because they’re triangles. But the interpolated data can either get perspective correction or not. If the middle one happened to be a quad with the top two vertices moved closer together, it would look the same with or without perspective correction, since all 4 vertices are at the same depth, so there’s no perspective to correct for. The left looks the same in both cases as well, for the same reason.

Now, let’s do one more test. Let’s use screen space UVs.
[Image: the same three quads, this time textured with their screen space positions]
Similar setup, but now instead of using the mesh UV, we’re using the screen space positions (this is using the built-in renderer, so it’s using the values from ComputeScreenPos(), but it’s the same as if I were using positionNDC). The middle and right examples are both geometry that’s been rotated 45 degrees away, but the middle is doing the divide by w in the vertex shader, and the right is doing the divide by w in the fragment shader.

The important question is why do we need to do the divide by w at all? It’s to undo perspective correction. In fact, the right is also what doing the divide by w in the vertex shader and disabling perspective correction looks like. Or what a mesh with the top two vertices moved closer together would look like either way. The left looks the same either way because again, there’s no perspective correction to do.
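
Sketched in URP terms (illustrative names; positionNDC standing in for ComputeScreenPos(), same include as above):

struct Varyings
{
    float4 positionCS   : SV_POSITION;
    float4 screenPos    : TEXCOORD0; // still in the 0..w range, divide per pixel
    float2 screenPosVtx : TEXCOORD1; // already divided by w per vertex
};

Varyings vert(float4 positionOS : POSITION)
{
    Varyings OUT;
    VertexPositionInputs posInputs = GetVertexPositionInputs(positionOS.xyz);
    OUT.positionCS   = posInputs.positionCS;
    OUT.screenPos    = posInputs.positionNDC;
    OUT.screenPosVtx = posInputs.positionNDC.xy / posInputs.positionNDC.w;
    return OUT;
}

half4 frag(Varyings IN) : SV_Target
{
    // Dividing by w per pixel undoes the perspective correction: correct screen
    // space UVs, as in the example on the right.
    float2 uvFragmentDivide = IN.screenPos.xy / IN.screenPos.w;

    // Dividing per vertex and then letting the interpolation perspective-correct
    // the already divided values gives the warped result in the middle example.
    float2 uvVertexDivide = IN.screenPosVtx;

    return half4(uvFragmentDivide, 0, 1);
}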

SV_POSITION is a different beast, because the data assigned in the vertex shader stage is not the data that the fragment shader gets. In the vertex shader you set the homogeneous clip space position for that vertex, and in the fragment shader you get the xy pixel position, the non-linear z depth in z, and the camera space depth (or 1 for ortho) in w. Actually, the w is the only value from the homogeneous clip space position that remains untouched.

As for why your positionCS2 use isn’t working and positionNDC is… in what way isn’t it working? Technically those two don’t quite match, but they should be very similar. As long as you’re setting positionCS2 = positionCS and passing the full float4, you should get plausible screen UVs with that code. They might be flipped upside down in some situations compared to positionNDC, but it should work.


Sorry for my misunderstandings. You are right. I didn’t realise the y component was flipped. I have updated the previous post.
Thank you Bgolus, I always learn a lot from you.
Here is my new understanding of the division by w in the Fragment Shader (I hope I understand it correctly).

Every attribute we access in the Fragment Shader is calculated by “perspective correct barycentric interpolation”. A useful link here: Microsoft Word - lowk_persp_interp_06.doc (nus.edu.sg)
[Image: the perspective correct barycentric interpolation formula; it is written out below the symbol list]

  • (α, β, γ) is the barycentric coordinate of that fragment (pixel) in the triangle ABC.
  • I_A, I_B, and I_C are the attribute values at vertices A, B, and C.
  • Z_A, Z_B, and Z_C are the camera space depths of vertices A, B, and C.
  • Z_t is the camera space depth of that fragment (pixel).
  • I_t is the interpolated attribute value at that fragment (pixel).
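
Written out with those symbols (my reconstruction of the formula shown in the image, following the linked notes), the interpolation is:

1 / Z_t = α / Z_A + β / Z_B + γ / Z_C

I_t = Z_t * ( α * I_A / Z_A + β * I_B / Z_B + γ * I_C / Z_C )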

In the Fragment Shader, we can divide the positionNDC by w to get the screenUV:
float2 screenUV = i.positionNDC.xy / i.positionNDC.w;

[Image: the interpolation formula rewritten in terms of Q_A = I_A / Z_A]

Assuming I_A is the positionNDC of vertex A, dividing it by the camera space depth of point A yields the new attribute Q_A. Since positionNDC.xy is the screenUV multiplied by w, and w is the camera space depth, Q_A is exactly the screenUV of vertex A.
Therefore, performing barycentric interpolation between Q_A, Q_B, and Q_C naturally yields the screenUV of the fragment as well.
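
Spelled out: with Q_A = I_A / Z_A, Q_B = I_B / Z_B and Q_C = I_C / Z_C, the formula above becomes

I_t = Z_t * ( α * Q_A + β * Q_B + γ * Q_C )

and because the perspective correct interpolation of positionNDC's own w component works out to exactly Z_t, the division i.positionNDC.xy / i.positionNDC.w in the fragment shader leaves just α * Q_A + β * Q_B + γ * Q_C, i.e. the screenUV interpolated linearly in screen space.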

If I calculate the screenUV in the vertex shader and pass vertex A’s screenUV as I_A into the above formula, the value I receive in the fragment shader stage will be the perspective correct interpolation of the screenUVs (the interpolated screenUV/depth, rescaled by the fragment’s depth), not the screen space linear screenUV I actually want.

That is why I cannot simply calculate the screenUV in the vertex shader and pass it to the fragment shader.

If I want an attribute to be linearly interpolated in screen space, then I should multiply it by its camera space depth (like i.positionCS.w or i.positionNDC.w) in the Vertex Shader, and divide by the interpolated w again in the Fragment Shader. I am wondering: besides screenUV, what other attributes or interesting effects need this trick?
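
For reference, a minimal sketch of that trick (same URP include as above; myAttribute is a hypothetical per-vertex value):

struct Attributes
{
    float4 positionOS  : POSITION;
    float2 myAttribute : TEXCOORD0; // hypothetical per-vertex data
};

struct Varyings
{
    float4 positionCS : SV_POSITION;
    float2 attrTimesW : TEXCOORD0; // attribute pre-multiplied by clip space w
};

Varyings vert(Attributes IN)
{
    Varyings OUT;
    OUT.positionCS = TransformObjectToHClip(IN.positionOS.xyz);
    // Pre-multiplying by w cancels the 1/w the hardware applies during
    // perspective correct interpolation, so the value effectively gets
    // interpolated linearly in screen space.
    OUT.attrTimesW = IN.myAttribute * OUT.positionCS.w;
    return OUT;
}

half4 frag(Varyings IN) : SV_Target
{
    // SV_POSITION.w in the fragment stage is the camera space depth (Z_t above),
    // so this recovers the screen-linearly interpolated attribute.
    float2 screenLinearAttr = IN.attrTimesW / IN.positionCS.w;
    return half4(screenLinearAttr, 0, 1);
}

Where the target platform supports it, declaring the interpolator with the noperspective modifier should give the same screen-linear behaviour without the manual multiply and divide.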