Hello!
I’ve been using Unity for a while, but I’m currently on a project where I’ve been encouraged to push technical limits, which is always fun as a developer! I know a bit about the Unity pipeline, but not so much about the finer details, especially batching draw calls and the like. Honestly, that’s an area where my knowledge of engines in general drops off a bit, so forgive my ignorance.
So currently I’m working on a massive GPU-based particle system to simulate some natural phenomena. (I built a GPU-based numerical ODE solver, which I’m pretty excited about!) Anyway, I’m holding position and other per-particle data in a structured ComputeBuffer that I pass to a shader as a StructuredBuffer. With this in place, I’m looking at different approaches for rendering the particles. They will most likely be textured quads, but potentially with a few extra triangles if I can manage it. Ideally I’d like a single system to render up to 1,000,000 particles at VR frame rates (at least 60 fps, 90 much preferred). I don’t think that’s unreasonable with new hardware. I’ve been developing on an MSI laptop with a GTX 1060 inside.
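To put rough numbers on that target, here is a back-of-the-envelope sketch in Python. It assumes one quad (4 vertices, 2 triangles) per particle and ignores any extra per-eye cost in VR, so treat it as an estimate only. The names are made up for illustration.

```python
# Rough frame-budget arithmetic for the particle target above.
# Assumption: each particle is a single quad (4 vertices, 2 triangles).

PARTICLES = 1_000_000
VERTS_PER_QUAD = 4
TRIS_PER_QUAD = 2


def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


total_verts = PARTICLES * VERTS_PER_QUAD
total_tris = PARTICLES * TRIS_PER_QUAD

for fps in (60, 90):
    budget = frame_budget_ms(fps)
    tris_per_second = total_tris * fps
    print(f"{fps} fps: {budget:.2f} ms per frame, "
          f"{tris_per_second / 1e6:.0f} M triangles per second")
```

Under those assumptions, everything (simulation plus rendering) has to fit in roughly 11 ms per frame at 90 fps.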
The approaches I’ve been trying:
- Geometry shader & Graphics.DrawProcedural (obviously not the fastest, but convenient for testing).
- Batched meshes with many quads per mesh. I’ve been following the approach here (note, just the rendering approach): GitHub - i-saint/MassParticle. It’s interesting, though, because I think this system doesn’t actually take advantage of dynamic batching, due to too many verts per mesh and material instancing. By reducing the maximum verts per mesh and using shared materials, I was able to get dynamic batching working and gain 10-20 fps on this system.
- Regular mesh instancing with Graphics.DrawMeshInstanced(). This seems to break down pretty quickly and frame rates plummet.
- Trying to create some mix of batching and instancing. I haven’t quite gotten this to work, but it would be a mesh with many quads per mesh (staying within the limit to use dynamic batching), and then instancing that mesh. Early tests suggest this isn’t the answer.
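To compare the approaches above, here is a rough count of how many meshes or instanced calls each would need for 1,000,000 quads. The limits are assumptions based on my reading of the Unity docs: 1023 instances per Graphics.DrawMeshInstanced call, 65,535 vertices per mesh with 16-bit indices, and a 300-vertex cap for dynamic batching (lower if the shader uses more vertex attributes). The constant and function names are made up.

```python
import math

PARTICLES = 1_000_000
VERTS_PER_QUAD = 4

# Assumed limits (check them against the Unity version in use).
MAX_INSTANCES_PER_CALL = 1023      # Graphics.DrawMeshInstanced
MAX_VERTS_16BIT_MESH = 65_535      # mesh with a 16-bit index buffer
MAX_VERTS_DYNAMIC_BATCH = 300      # dynamic batching vertex cap


def chunks_needed(items: int, per_chunk: int) -> int:
    """Number of chunks needed to hold `items` at `per_chunk` each."""
    return math.ceil(items / per_chunk)


quads_per_big_mesh = MAX_VERTS_16BIT_MESH // VERTS_PER_QUAD
quads_per_batchable_mesh = MAX_VERTS_DYNAMIC_BATCH // VERTS_PER_QUAD

instanced_calls = chunks_needed(PARTICLES, MAX_INSTANCES_PER_CALL)
big_meshes = chunks_needed(PARTICLES, quads_per_big_mesh)
batchable_meshes = chunks_needed(PARTICLES, quads_per_batchable_mesh)

print("one quad per instance:", instanced_calls, "DrawMeshInstanced calls")
print("large batched meshes: ", big_meshes, "meshes")
print("dynamic-batch meshes: ", batchable_meshes, "meshes")
```

If those limits hold, instancing single quads means close to a thousand calls per frame, while meshes small enough for dynamic batching number in the thousands, which may be part of why neither scales well.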
I think the VERY fastest approach would be to use Graphics.DrawProceduralIndirect with 4x as many point primitives as I have particles. The problem with this approach is having to do lighting and shadows completely manually (correct me if I’m wrong).
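For what it’s worth, the vertex-ID trick this relies on can be sketched outside the shader. The Python below mimics what a vertex shader could do with SV_VertexID when there is no mesh: work out which particle and which quad corner a vertex belongs to, assuming 4 vertices per particle. The corner layout and function names are illustrative, not taken from any existing system.

```python
# Mimics per-vertex logic for procedural quads: no vertex buffer,
# only a running vertex index (what SV_VertexID provides in a shader).
# Assumption: 4 vertices per particle, wound counter-clockwise.

from typing import Tuple

VERTS_PER_PARTICLE = 4

# Corner offsets in units of particle size, centred on the particle.
CORNERS = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]


def decode_vertex(vertex_id: int) -> Tuple[int, Tuple[float, float]]:
    """Return (particle index, corner offset) for a vertex index."""
    particle = vertex_id // VERTS_PER_PARTICLE
    corner = CORNERS[vertex_id % VERTS_PER_PARTICLE]
    return particle, corner


def vertex_count(particles: int) -> int:
    """Total vertices to request from the procedural draw call."""
    return particles * VERTS_PER_PARTICLE


for vid in range(6):
    print(vid, decode_vertex(vid))
```

In a real shader, the particle index would be used to read the position from the StructuredBuffer, and the corner offset would be applied to expand the quad around it.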
General notes or ideas are very welcome. I have no hardware limitations; the installation will most likely run on GTX 1080s. I’m just trying to see how far I can push it and what the best approach is. Thanks for your help!