No colliders, just usual components like LocalToWorld and moving aspect.
The Profiler shows only one big difference: Gfx.WaitForPresentOnGfxThread (30 ms vs 7 ms).
What are the differences between the two cases in the hierarchy or in the inspector?
Is VSync enabled? Maybe the frame time just slightly tips over into the next VSync interval, making a small change look like a huge difference.
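To illustrate what I mean by tipping into the next VSync tier, here is a rough sketch in Python. It assumes a 60 Hz display and plain double-buffered VSync, where a frame can only be presented on a refresh boundary. The function name and the numbers are made up for illustration and are not taken from the profiler capture above.

```python
import math

def presented_frame_time_ms(work_ms, refresh_hz=60.0):
    """Toy model: with VSync on, a frame is shown at the first refresh
    boundary after its work is done (assumes no triple buffering)."""
    interval_ms = 1000.0 / refresh_hz
    # Round the work time up to a whole number of refresh intervals.
    intervals = max(1, math.ceil(work_ms / interval_ms))
    return intervals * interval_ms

# 16 ms of work fits in one 60 Hz interval, 17 ms does not.
print(f"{presented_frame_time_ms(16.0):.1f} ms")  # about 16.7 ms
print(f"{presented_frame_time_ms(17.0):.1f} ms")  # about 33.3 ms
```

So in this model a 1 ms difference in actual work shows up as a jump of a full refresh interval, and the extra time is spent waiting for present. Turning VSync off for the test would show whether that is what is happening here.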
Overdrawing the same object 7000 times, vs. drawing the object 7000 times in different locations, does not by itself explain why the FPS tanks. I’m curious about this one.
The render thread time is clearly higher. Overdraw can be more expensive than drawing objects next to each other.
Did you try in a build?
Drawing instanced from the CPU might be worth trying to improve performance as well.
I don’t get it - on a technical level, why should drawing the object in the same spot be slower than drawing the object next to each other? It’s doing the same z-tests, the same geometry, etc. In forward mode, it would be doing the same number of lit pixels. In deferred, again, all the same number of lit pixels.
Do you have a solid explanation in terms of buffers/tests and GPU architecture?
It’s one thing to say that it is slower because they are drawn in the same spot (which we can infer from the results and conditions), but quite another to explain why. I’m genuinely curious to learn the answer.
I work a lot with mobile, and tiled GPUs can be very picky on certain depth optimizations. When stuff overlaps it will often run the pixel shader for that pixel multiple times (where everything side by side would be once per pixel).
I am not fully sure how this translates to regular GPUs, however. I think overlapping there can also cause the pixel shader to run multiple times for the same pixel, which might break some optimizations (or some optimizations might be disabled in general because of it).
This is purely speculation, however.
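To make the "once per pixel vs. many times per pixel" point concrete, here is a toy counting model in Python. It is only a sketch and not a model of any real GPU: it assumes opaque quads drawn in order with a less-or-equal depth test (Unity's default ZTest is LEqual), and it simply counts how often a fragment would pass the test and be shaded. The function name, quad size, and object count are hypothetical.

```python
from collections import Counter

def shade_counts(rects):
    """rects: list of (x, y, width, height, depth) drawn in order.
    Returns a Counter mapping pixel -> number of fragments shaded there,
    using a less-or-equal depth test on a toy depth buffer."""
    depth_buffer = {}
    runs = Counter()
    for x0, y0, w, h, z in rects:
        for y in range(y0, y0 + h):
            for x in range(x0, x0 + w):
                if z <= depth_buffer.get((x, y), float("inf")):
                    runs[(x, y)] += 1          # fragment passes and is shaded
                    depth_buffer[(x, y)] = z

    return runs

n = 100  # stand-in for the 7000 objects in the thread
stacked = shade_counts([(0, 0, 4, 4, 1.0)] * n)
side_by_side = shade_counts([(i * 4, 0, 4, 4, 1.0) for i in range(n)])

print(sum(stacked.values()), max(stacked.values()))
print(sum(side_by_side.values()), max(side_by_side.values()))
```

In this model the total number of shaded fragments is the same in both layouts. What differs is how many of them land on the same pixel: all of them when stacked, one when side by side. Whether and why that per-pixel pile-up costs more on a given GPU is exactly the part I am speculating about.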
You could try the rendering debugger and frame debugger to test out some stuff and see how that changes your performance.