Use a giant triangle instead of quad for postprocessing

Post-processing stack v2 has a method, BlitToFullscreenTriangle(), that it uses instead of the normal Blit(). It does what you would expect: it sets the destination render texture as the render target, then renders a full-screen mesh that has the appropriate material and texture.
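For anyone wondering what a "full-screen triangle" looks like in numbers, here is a minimal sketch of the common vertex-ID trick for generating one. This is an assumption about the general technique, not necessarily how BlitToFullscreenTriangle() builds its mesh, and the function name is made up for illustration.

```python
def fullscreen_triangle_vertex(vertex_id):
    """Return ((clip_x, clip_y), (u, v)) for vertex_id 0, 1 or 2.

    Hypothetical helper: mirrors the bit trick often written in a vertex
    shader, where the position is derived from the vertex index alone.
    """
    u = float((vertex_id << 1) & 2)  # 0, 2, 0
    v = float(vertex_id & 2)         # 0, 0, 2
    # Map UV 0..2 to clip space -1..3, so the visible screen (-1..1)
    # ends up with UVs in the usual 0..1 range after interpolation.
    return (u * 2.0 - 1.0, v * 2.0 - 1.0), (u, v)


vertices = [fullscreen_triangle_vertex(i) for i in range(3)]
for position, uv in vertices:
    print("position", position, "uv", uv)
```

The three positions come out as (-1, -1), (3, -1) and (-1, 3), so two of the vertices sit well outside the screen and the part of the triangle inside the -1..1 range is exactly the full screen.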

Is there any benefit to this, aside from using one less vertex? It seems odd to go through the effort just for that. AFAIK both will get turned into two triangles for rasterization, right?

Not sure if it applies to all hardware, but at least on some, the one giant triangle rasterizes faster due to the way GPU memory caching works.

Somewhat detailed post that covers it nicely: GCN Execution Patterns in Full Screen Passes – Michal Drobot
And another that shows a different example of what I believe is the same phenomenon: Humus - Comments


Even ignoring the memory access benefits, all modern GPUs render pixels not one at a time, but in groups of somewhere between 2x2 and 8x8 pixels. If a triangle overlaps even a single pixel of that group, the entire tile (AKA quad, warp, or wave) is run. A tile that is overlapped by two triangles has to be run twice, once for each triangle. This means that in the two-triangle quad case, every tile along the diagonal down the middle of the screen gets run twice.
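To put rough numbers on that, here is a toy model in Python. It is only a sketch under simplifying assumptions: a square 64x64 screen, fixed square tiles, and each triangle described by a simple coverage test rather than a real rasterizer. Real GPUs differ in tile size and scheduling, and all names here are made up for illustration.

```python
def count_tile_runs(width, height, tile, triangles):
    """Count how many times a tile is run: once per triangle that covers
    at least one pixel centre inside that tile."""
    runs = 0
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            for covers in triangles:
                if any(covers(x + 0.5, y + 0.5)
                       for y in range(ty, min(ty + tile, height))
                       for x in range(tx, min(tx + tile, width))):
                    runs += 1
    return runs


# Two-triangle quad on a square screen, split along the diagonal y == x.
# Every pixel centre belongs to exactly one of the two halves.
def lower_half(x, y):
    return y <= x


def upper_half(x, y):
    return y > x


# One oversized triangle: every on-screen pixel is covered.
def whole_screen(x, y):
    return True


SIZE = 64
for tile in (2, 8):
    quad = count_tile_runs(SIZE, SIZE, tile, [lower_half, upper_half])
    single = count_tile_runs(SIZE, SIZE, tile, [whole_screen])
    print(f"{tile}x{tile} tiles: two triangles = {quad}, one triangle = {single}")
```

In this model the two-triangle version runs 1056 tiles versus 1024 with 2x2 tiles, and 72 versus 64 with 8x8 tiles. The extra runs are exactly the tiles the diagonal passes through, so the relative overhead grows with the tile size.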


Thank you both for the response!

I read somewhere that the triangle would be clipped into a 2-triangle-quad anyway. Is this not true? Or am I missing something here?

Totally false. The GPU isn’t going to make new triangles unless it’s told explicitly to do so via a geometry shader or hardware tessellation, neither of which is likely to be used for a full-screen triangle.

I’ve noticed quite a bit of confusion around screen-related clipping, including some official D3D11 documentation that’s outright wrong. A single triangle that covers the entire screen stays a single triangle. How much bigger the triangle is than the screen is essentially irrelevant to performance, as only the pixels the triangle covers affect performance. Since the entire screen is covered either way, a triangle that just barely covers the screen and one that is 1000 times larger both cover the same number of pixels. There are potential precision issues with going too big, but that’s a different issue.
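A quick way to convince yourself of the "same number of pixels" point is to count covered pixel centres for both triangle sizes. This is a minimal Python sketch using plain edge functions on a tiny 64x32 screen. It is not how any particular GPU rasterizes, and the helper names are invented for the example.

```python
def edge(a, b, p):
    """Signed area test: positive when p is to the left of the edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def covered_pixels(width, height, tri):
    """Count on-screen pixel centres inside a counter-clockwise triangle
    given in normalized device coordinates (screen spans -1..1)."""
    count = 0
    for y in range(height):
        for x in range(width):
            p = ((x + 0.5) / width * 2.0 - 1.0,
                 (y + 0.5) / height * 2.0 - 1.0)
            if (edge(tri[0], tri[1], p) >= 0
                    and edge(tri[1], tri[2], p) >= 0
                    and edge(tri[2], tri[0], p) >= 0):
                count += 1
    return count


def oversized_triangle(scale):
    """Right triangle anchored at the (-1, -1) screen corner.
    scale=1 just covers the screen; larger values overshoot it."""
    size = 4.0 * scale
    return [(-1.0, -1.0), (-1.0 + size, -1.0), (-1.0, -1.0 + size)]


tight = covered_pixels(64, 32, oversized_triangle(1.0))
huge = covered_pixels(64, 32, oversized_triangle(1000.0))
print("tight:", tight, "huge:", huge)
```

Both counts equal the number of pixels on the screen (64 * 32 = 2048), since only pixel centres that are actually on screen are ever visited.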


Also this article: Vulkan tutorial on rendering a fullscreen quad without buffers

AFAIK the benefit isn’t huge, but there is one (it avoids repeated shading caused by 2x2 quad rendering): Shader Optimizations | Krzysztof Narkowicz
