I have a shadow map atlas, and I need to render an object into each cascade of it. I could just call SetViewport and Draw for each cascade, but I'm trying to implement it using instancing, and I can't call SetViewport per instance, so I need to emulate the viewport functionality inside the shader. I managed to do it for directional lights (i.e. non-perspective lights), but for punctual lights it doesn't work. Here's the code for directional lights:
Vert:
o.position = mul(_ShadowmapAtlasViewProjMatrixArray[instanceID], float4(posWS,1));
//Easier to work in 0;1 space, remap back into -1;1 in the end
o.position.xy = o.position.xy*0.5+0.5;
o.positionInsideViewport = o.position.xy;
float4 viewport = _ShadowmapViewportOffsetMultiplierArray[instanceID];
/*Viewport is :
Vector4(
shadowRequest.atlasViewport.xMin,
shadowRequest.atlasViewport.yMin,
(1f / atlasWidth) * (atlasWidth/ shadowRequest.atlasViewport.width),
(1f / atlasHeight) * (atlasHeight/ shadowRequest.atlasViewport.height)
);
*/
//This gives us viewport.zw = (viewportSize / atlasSize)
viewport.zw /= _ShadowmapAtlasSize.zw;
viewport.zw = 1 / viewport.zw;
o.position.xy *= viewport.zw;
float2 offset = viewport.xy*_ShadowmapAtlasSize.zw;
offset.y = 1-offset.y-viewport.w;
o.position.xy += offset;
//Remap back into -1;1
o.position.xy = o.position.xy*2-1;
And in Frag I just clip based on positionInsideViewport:
clip(input.positionInsideViewport); // clip negative value
clip(1.0 - input.positionInsideViewport); // Clip value above one
I tried doing the perspective divide before all the transformations and undoing it after them, e.g.
o.position.xy /= o.position.w;
o.position.xy = o.position.xy*0.5+0.5;
...transformations here...
//Remap back into -1;1
o.position.xy = o.position.xy*2-1;
o.position.xy *= o.position.w;
but this gives incorrect results. Any help is appreciated.
It isn’t safe to interpolate only two components of the clip space position in a perspective projection, which is part of why this works for directional lights but not for punctual lights. But…
The perspective divide is indeed the other reason. In an orthographic projection, the w component is always 1, so there’s no need to do the perspective divide, or rather there’s no difference between doing it or not. To make the modifications to the projection work with a perspective projection, you do need to do the perspective divide, or at least take the w component into account, when modifying the clip space xy.
However, the o.positionInsideViewport is even easier. It should be using the o.position.xyw from before any modifications, with the divide by w done per pixel in the fragment shader!
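A minimal sketch of both changes, reusing the names from the question. The helper names RemapToAtlasViewport and ClipToViewport, the scale/offset locals, and the float3 interpolator are assumptions for illustration, not the original code:

```hlsl
// Hypothetical helper: squeezes a clip-space position into one atlas viewport.
// 'viewport' and 'atlasSize' are the values from the question
// (_ShadowmapViewportOffsetMultiplierArray[instanceID] and _ShadowmapAtlasSize).
float4 RemapToAtlasViewport(float4 posCS, float4 viewport, float4 atlasSize)
{
    float2 scale  = 1.0 / (viewport.zw / atlasSize.zw); // viewportSize / atlasSize
    float2 offset = viewport.xy * atlasSize.zw;         // viewportMin / atlasSize
    offset.y = 1.0 - offset.y - scale.y;                // same Y flip as the question

    // In NDC the remap is ndc * scale + (scale + 2 * offset - 1).
    // In clip space the constant term is multiplied by w so that it
    // survives the hardware perspective divide.
    posCS.xy = posCS.xy * scale + (scale + 2.0 * offset - 1.0) * posCS.w;
    return posCS;
}

// Vertex shader body (assumes positionInsideViewport is now a float3):
//   float4 posCS = mul(_ShadowmapAtlasViewProjMatrixArray[instanceID], float4(posWS, 1));
//   o.positionInsideViewport = posCS.xyw; // unmodified clip position
//   o.position = RemapToAtlasViewport(posCS, viewport, _ShadowmapAtlasSize);

// Fragment shader: divide per pixel, then clip to the 0..1 range.
void ClipToViewport(float3 positionInsideViewport)
{
    float2 uv = positionInsideViewport.xy / positionInsideViewport.z * 0.5 + 0.5;
    clip(uv);       // discard where any component is negative
    clip(1.0 - uv); // discard where any component is above one
}
```

With an orthographic projection w is 1, so the remap reduces to the directional light version from the question.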
And we’re … not done? Unfortunately there is some noise around the center axis clip that I don’t fully grok, since it’s perfect if the clipped edge doesn’t fall precisely on that edge. Likely some fun floating point funkiness.
Luckily this only appears to be a problem if the render target resolution is an odd number. With even-number resolutions the edge is straight, as you’d expect, so in your case it might not be a problem.
Not that it matters for your use case, but it finally clicked in my head why it was having problems with odd number resolutions. Floating point interpolation means the center axis pixels are going to be equal to 0.0 +/- a tiny bit. That tiny bit means it’s basically up to chance if it’s going to exactly match 0.0 or be slightly greater / less than 0.0. The best fix is probably to adjust the scale & offsets to align to pixel dimensions for each viewport, or to convert to floored pixel space in the shader to do the clip.
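One way to read the "floored pixel space" suggestion is to compare the pixel coordinate from SV_Position against the viewport rectangle in atlas pixels. This is only a sketch: ClipToViewportPixels and viewportRectPx are hypothetical names, and it assumes the rectangle is passed per instance in the same Y orientation as SV_Position, which can differ per platform.

```hlsl
// viewportRectPx = (xMin, yMin, xMax, yMax) in atlas pixels, max exclusive.
// Hypothetical per-instance value, e.g. passed as a nointerpolation varying.
void ClipToViewportPixels(float4 svPosition, float4 viewportRectPx)
{
    // SV_Position.xy sits at pixel centers (x + 0.5, y + 0.5) by default,
    // so floor() gives exact integer pixel coordinates.
    float2 pixel = floor(svPosition.xy);

    clip(pixel - viewportRectPx.xy);       // discard left of / above the min corner
    clip(viewportRectPx.zw - 1.0 - pixel); // discard at or beyond the max corner
}
```

Because the comparison is between integers, no pixel lands at 0.0 plus or minus a tiny bit, so the edge no longer depends on interpolation rounding.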
If anyone is looking this up in the future: don't use fragment shader clipping/discarding as was initially proposed. It's terribly slow, since you're wasting threads on simply checking whether a pixel is visible or not (at least that's how I think it works). But thankfully there's a much better way: the SV_ClipDistance semantic (Semantics - Win32 apps | Microsoft Learn). The idea is that you output distances to arbitrary planes from your vertex shader, and the hardware automatically clips based on those distances (clipping where the distance is < 0) without wasting precious threads.
Declare an output parameter in your vertex output struct with the SV_ClipDistance semantic, sized to the number of clip planes (I use 4 because I don't need near/far clips in this case, hence float4):
struct v2f
{
....
float4 clipDistances : SV_ClipDistance;
//Or, if you want more planes
//float4 clipDistances0 : SV_ClipDistance0;
//float4 clipDistances1 : SV_ClipDistance1;
//etc...
};
Then in your vertex shader, write the distances to the planes. In my case I convert world positions to view space and do a dot product against a frustum clip plane normal (my clip planes are in view space as well):
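A minimal sketch of that step, not the poster's exact code. ComputeClipDistances, viewMatrix and planesVS are hypothetical names, and each plane is assumed to be stored as (normal.xyz, d) in view space with the normal pointing into the frustum:

```hlsl
// Returns signed distances to four view-space planes; negative means "outside".
// For the side planes of a perspective frustum d is 0 (they pass through the
// view-space origin), so this reduces to a dot product with the plane normal.
float4 ComputeClipDistances(float3 posWS, float4x4 viewMatrix, float4 planesVS[4])
{
    float4 posVS = float4(mul(viewMatrix, float4(posWS, 1.0)).xyz, 1.0);
    return float4(
        dot(posVS, planesVS[0]),  // left
        dot(posVS, planesVS[1]),  // right
        dot(posVS, planesVS[2]),  // bottom
        dot(posVS, planesVS[3])); // top
}

// Vertex shader body:
//   o.clipDistances = ComputeClipDistances(posWS, viewMatrix, planesVS);
```

The rasterizer interpolates these values and drops everything where any of them is negative, so no clip() call is needed in the fragment shader.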
Small warning: this option is supposedly super bad on some hardware. Probably not worse than the clip method, but I have seen some discussions on Twitter between people complaining that it’s uselessly bad on some hardware… but I can’t remember which.
My vague memory is they were complaining about AMD GCN GPUs.
rummages around twitter
Looks like AMD is fine. It looks like the tweet that calls out the specific GPU that they were having problems with in the thread I remembered has been deleted (specifically the author deleted their twitter account). So shrug. Seems like 2019 era AMD is fine, presumably Nvidia is fine, so maybe it was some mobile device.