Performance optimization for many tris/vertices

I have a runtime world building tool that uses cubes. Roughly 25% of them have a higher poly count because I deform them.

The problem I’m running into is that combining the meshes isn’t helping performance much. I think it’s because, while the CPU is doing less work, it’s sending more to the GPU at once because everything is combined; when the meshes aren’t combined, it isn’t sending as much. I have a system that combines them at runtime when I toggle between build and play mode in the game. When they are batched, the draw calls drop way down, but the frame rate is still really bad, and the profiler shows Gfx.WaitForPresent taking up all the CPU time. The frame rate combined vs. not combined is within about 10 fps either way. I am grouping my combined meshes by material, so it’s one combined mesh per material, with some further subdivision to stay under the max vertex count per mesh.
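For readers following along, the grouping described above can be sketched roughly as follows. This is not the poster's actual code, just a minimal, engine-independent illustration in Python. The function name `plan_batches`, the `(material, vertex_count)` input shape, and the 65535-vertex cap (the usual limit with 16-bit indices) are all assumptions made for the example.

```python
from collections import defaultdict

# Assumption: each combined mesh must stay under the 16-bit index limit.
MAX_VERTS = 65535


def plan_batches(cubes, max_verts=MAX_VERTS):
    """Plan combined meshes for a list of (material, vertex_count) pairs.

    Returns a list of (material, [cube indices]) batches: one or more
    batches per material, each holding at most max_verts vertices
    (a single cube larger than max_verts still gets its own batch).
    """
    # Step 1: bucket cubes by material, remembering their original index.
    by_material = defaultdict(list)
    for index, (material, vert_count) in enumerate(cubes):
        by_material[material].append((index, vert_count))

    # Step 2: within each material, start a new batch whenever adding the
    # next cube would push the running vertex total past the cap.
    batches = []
    for material, items in by_material.items():
        current, current_verts = [], 0
        for index, vert_count in items:
            if current and current_verts + vert_count > max_verts:
                batches.append((material, current))
                current, current_verts = [], 0
            current.append(index)
            current_verts += vert_count
        if current:
            batches.append((material, current))
    return batches


# Hypothetical scene: 3000 "stone" cubes and 10 "grass" cubes, 24 verts each.
cubes = [("stone", 24)] * 3000 + [("grass", 24)] * 10
batches = plan_batches(cubes)
total_verts = sum(cubes[i][1] for _, members in batches for i in members)
```

In this made-up scene, 3010 separate objects collapse into 3 batches, but `total_verts` is the same as before combining. Batching cuts draw calls, not the number of vertices the GPU has to process.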

Tri/vert counts are right at 1.3M in my test scene.

I don’t understand why there is so little difference between uncombined and combined. Any tips on what I might do?

Of course right after posting I found the main issue. It’s the tessellation shader I was using.


In future, when you want to understand why things are slowing down on the GPU, I suggest using Intel GPA Frame Analyzer. Fantastic tool for looking at individual draw calls, times, state, resources, etc.