In my tests the low-poly version (4x650 tris, shaded with my displacement shader) renders about 2x faster than the hi-poly one (4x65k tris, specular shading). However, I can imagine that a hundred such hi-poly models (6.5M tris) would kill the vertex unit, while rendering the roughly 65k tris of 100 low-poly models would be just fine - the cost depends only on the number of pixels shaded, so it scales great as we move away from the object. We can think about it this way: there's no need to prepare LOD versions, as we already use the “lowest LOD” model no matter the distance, and as we get close to the model its detailed structure is smoothly revealed.
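A quick back-of-the-envelope check of those triangle budgets, as a small Python sketch. The per-model counts are the ones quoted above; the helper name `total_tris` is made up for illustration.

```python
# Per-model triangle counts quoted in the post.
HI_POLY_TRIS = 65_000
LOW_POLY_TRIS = 650

def total_tris(tris_per_model: int, model_count: int) -> int:
    """Total triangles the vertex unit has to process for model_count copies."""
    return tris_per_model * model_count

for count in (4, 100):
    hi = total_tris(HI_POLY_TRIS, count)
    lo = total_tris(LOW_POLY_TRIS, count)
    print(f"{count} models: hi-poly {hi:,} tris, low-poly {lo:,} tris ({hi // lo}x fewer)")
```

For 100 models this gives 6.5M tris against 65k, so the vertex load stays about two orders of magnitude lower however many copies are drawn.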
Of course we have to pay a price. The object has to be static - for obvious reasons no skinned animation is available. The low-poly mesh that “holds” the actual silhouette must be tight, meaning close to the real geometry, because ray tracing/traveling through the mesh volume is expensive. More issues are probably hidden somewhere, as usual (for example forward + shadows + postFX + alpha cutout shading), but this technique seems promising for certain applications. I'm thinking of using it to sweeten architectural objects (fancy baroque putti, decorations on walls, etc). I guess it’s possible to atlas resources to minimize material count (-> draw calls). It also gives us a way to produce a detailed look without needing DX11 features (available on a small number of platforms only - Win7 standalone/webplayer I guess?).
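To illustrate why the shell has to be tight, here is a toy 2D Python sketch of a fixed-step ray march against a height profile. It is a generic linear-search march, not the actual shader from this post; `surface_height`, `march`, the ray direction and the step size are all made-up assumptions.

```python
import math

def surface_height(x: float) -> float:
    """Hypothetical displaced surface: gentle bumps between 0.0 and 0.1 units."""
    return 0.05 + 0.05 * math.sin(8.0 * x)

def march(shell_top: float, step: float = 0.01, max_steps: int = 1000):
    """Step a ray down from the low-poly shell until it drops below the surface.

    Returns the number of steps taken, or None if nothing was hit.
    """
    x, y = 0.0, shell_top      # the ray starts on the low-poly shell
    dx, dy = 0.6, -0.8         # unit-length direction heading into the shell
    for steps in range(1, max_steps + 1):
        x += dx * step
        y += dy * step
        if y <= surface_height(x):
            return steps
    return None

tight = march(shell_top=0.12)  # shell hugging the real surface
loose = march(shell_top=0.5)   # shell far away from the real surface
print(f"tight shell: {tight} steps, loose shell: {loose} steps")
```

Every step would be a heightmap lookup in a real fragment shader, so the empty space between the low-poly mesh and the real surface is paid for per pixel.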
We need only a relatively small displacement map to get a very smooth look/silhouette of the object (in my example a 128x128 heightmap would work just fine and would give us a <25kB memory footprint). An even smaller (64x64) texture still uses bilinear interpolation, so we could have a very smooth (though not very detailed) visual representation.
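A rough Python estimate of that footprint, assuming an uncompressed single-channel 8-bit heightmap with a full mip chain. The texture format is an assumption, not something stated above, and `heightmap_bytes` is a made-up helper.

```python
def heightmap_bytes(size: int, bytes_per_texel: int = 1, mipmaps: bool = True) -> int:
    """Bytes used by a square size x size texture, optionally with all mip levels."""
    total = 0
    while True:
        total += size * size * bytes_per_texel
        if not mipmaps or size == 1:
            break
        size //= 2  # each mip level halves the resolution
    return total

for size in (128, 64):
    base = heightmap_bytes(size, mipmaps=False)
    full = heightmap_bytes(size, mipmaps=True)
    print(f"{size}x{size}: {base / 1024:.1f} KiB base, {full / 1024:.1f} KiB with mipmaps")
```

Under these assumptions the 128x128 map lands at 16 KiB without mipmaps and a bit over 21 KiB with them, which fits under the 25kB mentioned above.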
Performance - I’ve got a GT240, which easily handles hundreds of thousands of polys without hurting. Its fragment shading unit is also reasonable. Results at 1280x1024, forward, w/o shadows: hipoly (4 heads - 250k tris) ca. 1.6 ms, low poly (2.5k tris) 0.8-0.9 ms, so we get over 500fps even when we cover the whole screen with these ugly king-kong faces.
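For reference, the conversion from those timings to frame rates, as a Python sketch. It assumes the quoted milliseconds are the whole frame time, which may not be exactly how they were measured; `fps_from_ms` is a made-up helper.

```python
def fps_from_ms(frame_ms: float) -> float:
    """Frames per second for a given frame time in milliseconds."""
    return 1000.0 / frame_ms

timings_ms = {
    "hi-poly (4 heads, 250k tris)": 1.6,
    "low-poly, slow end": 0.9,
    "low-poly, fast end": 0.8,
}
for label, ms in timings_ms.items():
    print(f"{label}: {ms} ms -> {fps_from_ms(ms):.0f} fps")

# Speed-up of low-poly over hi-poly across the quoted range.
print(f"speed-up: {1.6 / 0.9:.2f}x to {1.6 / 0.8:.2f}x")
```

Both variants stay above 500fps under that assumption, and the ratio between them matches the "about 2x" figure.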
I believe the method will pay off most when we use many models shaded this way. In a complex scene, objects that are submitted for rendering but end up occluded (rejected by the z-buffer) skip fragment shading. As our models are low-poly, they would cost next to nothing. Optimizing geometry (occlusion), however, is much harder, so I believe this would work great for complex scenes with lots of objects.
I know that gamers complain that “this game sucks because I see low-poly edges”. DX11 tessellation solves it, and in a few years my solution will probably become obsolete, but for the coming 2-3 years some great titles could profit from the idea.
Mac compatibility - it’s always a pain in the neck. Cg->GLSL auto conversion is still problematic in Unity. It's much better than a year ago, but complex shaders still take ages to compile, so I use #pragma only_renderers d3d9 when writing new stuff and then check it in OpenGL mode. I’ll check in forced OpenGL mode on my PC whether it looks fine or not.
This shader will be available to buy as part of the “Relief Shaders Pack” (together with the “Relief Terrain Pack”) - both products are coming at the beginning of next week.
ATB, Tom