I have noticed that forward add passes are no longer required in the URP renderer. Instead, a single, more complex pass receives the light data and renders the mesh, the main light, and the additional lights all at once. How much overhead does this remove or add? Should we expect better lighting performance under these conditions, or is it just a different way of getting the same result as the built-in renderer?
This should probably be mentioned in the docs, if it isn't already, as it is a significant implementation detail.
The advantage is fewer draw calls (previously, each renderer had to be redrawn for every light that affects it) and less bandwidth (each forward add pass paints the model over itself with additive blending enabled). The disadvantage is a more complex shader with more ALU work per pixel.
Whether that runs faster depends on the target hardware and the scene. Overall, on mid- and low-end GPUs, ALU performance has been increasing at a faster pace than memory bandwidth. With the exception of very old mobile GPUs, the URP approach should be faster.
The problem comes if you’re targeting really terrible mobile GPUs that should have been abandoned a decade ago but are still around in brand new budget phones, namely the goddamn Mali 400, which is so damn slow that a single per-pixel light is “way too much” for it.
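To put rough numbers on the draw call and overdraw side of this, here is a toy Python model. It is not Unity code and it does not measure anything; the function names and the example scene are made up for illustration, and it assumes every additional light on an object is a per-pixel light. It only counts how often each object is drawn under the two schemes.

```python
def multi_pass_forward(additional_lights_per_object):
    """Built-in style (simplified): one base pass per object, plus one
    additive forward add pass for every additional per-pixel light."""
    draw_calls = 0
    additive_redraws = 0  # times a mesh is blended over itself
    for additional in additional_lights_per_object:
        draw_calls += 1 + additional
        additive_redraws += additional
    return draw_calls, additive_redraws


def single_pass_forward(additional_lights_per_object):
    """URP style (simplified): one draw per object; the additional
    lights are handled by a loop inside the shader instead."""
    draw_calls = len(additional_lights_per_object)
    shader_light_iterations = sum(additional_lights_per_object)
    return draw_calls, shader_light_iterations


# Hypothetical scene: three objects lit by 0, 2 and 4 additional lights.
scene = [0, 2, 4]
print(multi_pass_forward(scene))   # (9, 6)
print(single_pass_forward(scene))  # (3, 6)
```

In this sketch the lighting work is the same in both cases (6 light evaluations), but the multi-pass version spends it as 6 extra draws with additive blending, while the single-pass version spends it as 6 loop iterations inside 3 draws. That is the ALU versus draw call and bandwidth trade described above.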