Which of these two processes would lead to better performance:
One fbx chair imported into the Unity project, then duplicated 500 times in a scene. Single material. So 500 gameobjects.
500 chairs combined in Maya, then imported as a single large fbx and placed as a single unit in the scene. Single material. 1 gameobject.
I am 100% set on the chair design, so future iteration of the chairs doesn’t change things. I just want better performance.
If I’m understanding things right…
Performance-wise, it doesn’t make a difference. Whether it’s 1 gameobject or 500 gameobjects, Unity still has to load everything into memory and render all the chairs with a single material as a single draw call per frame.
It’s possibly worse, because:
I’d be wasting time combining and assembling things in a DCC.
I now have a massive fbx so it could adversely affect build times? Or does mesh compression work for me here?
It makes LOD groups and culling a pain, as the mesh is so much larger.
This leads to my next question:
Can I break everything in that room into unique objects? So 5 unique objects duplicated multiple times in scene- 50 light fixtures, 500 chairs, 250 window panes, etc etc. 1000… 2000 gameobjects is fine?
At what point should I combine things and not combine things? This is assuming I’m being smart about my materials and textures btw (re-using them).
Combined meshes can improve performance sometimes, but it depends on a lot of things. Have you used the profiler to determine what is causing your performance problems?
Between the lighting and reflection and sheer number of high-polygon items in that room (I’m looking at the fine contour of the seat cushions, all the curves and bends), it’s an order of magnitude more complicated than pretty much any modern AAA console game scene I’ve seen recently.
Getting that scene to run well on an iPhone is going to be a pretty tall order. Have you even tried an empty auditorium replacing each seat with a Cube and using that lighting on your phone hardware?
Remember: how it runs in the editor is irrelevant to how it will ultimately run. In addition to the profiler window mentioned above by kdgalla, check out the Frame Debugger to get more info on what all is being shoveled to the system… and run BOTH of those things on the hardware itself.
Damn, nice scene! Are you sure that’s not a photo?
I recommend that you start by profiling your application. You need to know what is the bottleneck on the target hardware. (Profiling on your development machine will be enough initially but you really should find out what tools you can use for profiling on the iPhone.)
You will usually find out you have one or two major bottlenecks that make all other optimizations meaningless. You will notice that your optimization attempts don’t result in any appreciable changes in performance, simply because most of the time is wasted somewhere else.
Try to find out:
Is the project CPU or GPU bound? (at a given rendering resolution)
Does reducing the rendering resolution improve performance by a lot?
If it does, it means you are GPU-bound, and your draw call count is probably reasonable.
Consider lowering the rendering resolution or using upscaling to improve performance.
Is memory usage reasonable?
How many draw calls/draw events are being issued to the GPU in order to draw a frame?
Which rendering features do all these draw calls correspond to?
e.g. skinning and other compute, depth prepass, shadow pass(es), opaque rendering, transparent rendering, deferred lighting, post processing + AA + upscaling
(expect FUN SURPRISES here)
What are the top 10 most expensive (longest duration) draw calls? Does their cost seem justified?
You can use RenderDoc to see all draw calls and their approximate duration
(expect FUN SURPRISES here)
Are there any objects that are rendered in separate draw calls instead of a single instanced draw call?
In RenderDoc you can scroll through all draw calls and see how each batch of objects is drawn on the screen, so it’s easy to visually find workloads that aren’t merged as expected
(expect FUN SURPRISES here)
Is there much overdraw?
How densely are the triangles packed in the scene? Are there objects that could use LODs?
Keep in mind that too many LODs can be bad for memory and draw call count also.
3 LOD levels max
Are there any draw calls that waste time by trying to draw meshes/triangles/pixels that are fully occluded?
Going through this list will immediately give you lots of specific optimization ideas.
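The resolution test near the top of that list boils down to a small calculation. Here is a rough sketch of it in Python (not Unity C#, just so it can be run anywhere). The function name, the 0.25 threshold, and the idea of comparing two averaged frame times are illustrative assumptions, not something taken from the profiler or from Unity's API.

```python
def classify_bottleneck(ms_full_res, ms_reduced_res, threshold=0.25):
    """Guess the bottleneck from two averaged frame times in milliseconds.

    ms_full_res:    frame time at the normal rendering resolution.
    ms_reduced_res: frame time with the rendering resolution lowered.
    threshold:      hypothetical cut-off; a relative speed-up at or above it
                    counts as "improves performance by a lot".
    """
    if ms_full_res <= 0 or ms_reduced_res <= 0:
        raise ValueError("frame times must be positive")

    # Fraction of the frame time saved by rendering fewer pixels.
    speedup = (ms_full_res - ms_reduced_res) / ms_full_res

    if speedup >= threshold:
        # Fewer pixels helped a lot, so per-pixel GPU work dominates.
        return "likely GPU-bound"
    # Fewer pixels barely helped: look at CPU cost, draw calls, or vertex load.
    return "likely not limited by pixel work"
```

Both measurements should come from the target device, not the editor, for the reasons given earlier in the thread.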
TL;DR: check out RenderDoc, it will change your life.
Back to earth. Rendering a single mesh 500 times using instancing would be more efficient because the mesh is stored only once in memory (better cache utilization etc). Instanced drawing is very fast compared to issuing a draw call per object. But merging objects is also a good way to reduce the number of draw calls, especially if you manage to find ways to merge materials.
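To put rough numbers on the "stored only once in memory" point, here is a back-of-the-envelope sketch in Python. The 32 bytes per vertex (position + normal + one UV set as 32-bit floats), the 64 bytes per instance (one 4x4 float matrix), and the 5,000-vertex chair are all assumptions for illustration. Index buffers and textures are ignored.

```python
def instanced_bytes(verts_per_mesh, instances,
                    bytes_per_vertex=32, bytes_per_transform=64):
    """Vertex data stored once, plus one transform matrix per instance."""
    return verts_per_mesh * bytes_per_vertex + instances * bytes_per_transform


def merged_bytes(verts_per_mesh, instances, bytes_per_vertex=32):
    """Every copy's vertices are baked into one big vertex buffer."""
    return verts_per_mesh * instances * bytes_per_vertex


# Hypothetical chair: 5,000 vertices, 500 copies.
chair_verts = 5000
copies = 500
shared = instanced_bytes(chair_verts, copies)
baked = merged_bytes(chair_verts, copies)
print(f"instanced: {shared / 1e6:.2f} MB, merged: {baked / 1e6:.2f} MB")
```

Under these assumptions the merged mesh carries 500 copies of the same vertex data, which is the memory cost you trade for fewer draw calls.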
Always verify that objects are correctly instanced as you expect. Batching is so easy to break, I assume it’s broken by default until proven otherwise. (I prefer RenderDoc for analyzing this over the frame debugger because it gives me a more objective “ground truth” view of what is happening on the GPU. I also prefer it because it’s the one tool that supports my GPU)
Combining meshes reduces the number of shadow casters in realtime lighting situations, which can make a HUGE difference in performance, especially if you have a level with thousands of tiny separate parts. It’s more performant than batching.
It reduces batches and draw calls, which helps CPU performance, IF AND ONLY IF the combined objects share the same material.
It’s the smart way to build levels, particularly when objects share a material. You will usually benefit on pretty much every platform.
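Roughly 64,000 vertices is the most you should usually combine into one mesh (Unity’s default 16-bit index format caps a mesh at 65,535 vertices). Use less on mobile, where the GPU is more likely to be vertex-bound.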
In addition to combining objects that share a material, prioritize combining meshes that are close to each other, because it allows you to more efficiently cull anything not visible to the camera.
I agree with @Kurt-Dekker that the scene example is going to be tough to run on an iPhone. In my opinion you would be vertex-bound.
A downside to combining large meshes, performance-wise, is that Unity does culling per-object. So if you have a very large object, but you’re only seeing a little corner of it, Unity is still going to process all of its vertices.
So in general combining is good, but if you combine that entire scene, you’ll end up drawing a lot of off-screen data. So you’ll want to play around with how to combine the chairs, probably a section at a time.
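Here is one way the "section at a time" grouping could be sketched, in Python so it runs outside Unity. Chairs are bucketed into square floor cells, and each cell is split further if it would exceed a vertex budget. The function name, the cell size, and the 64,000 default budget are assumptions for illustration; the right values have to come from testing on the device.

```python
from collections import defaultdict


def group_into_sections(positions, cell_size, verts_per_chair, max_verts=64000):
    """Group chair indices into nearby sections that fit a vertex budget.

    positions:       list of (x, z) floor coordinates, one per chair.
    cell_size:       side length of a square floor cell, in the same units.
    verts_per_chair: vertex count of one chair mesh.
    max_verts:       assumed upper bound for one combined mesh.
    Returns a list of sections, each a list of indices into positions.
    """
    if cell_size <= 0 or verts_per_chair <= 0:
        raise ValueError("cell_size and verts_per_chair must be positive")
    if verts_per_chair > max_verts:
        raise ValueError("a single chair already exceeds the vertex budget")

    # Bucket chairs by the floor cell they stand in, so each section is compact
    # and can be culled as a unit when it is off-screen.
    cells = defaultdict(list)
    for index, (x, z) in enumerate(positions):
        cells[(int(x // cell_size), int(z // cell_size))].append(index)

    # Split any cell that holds more chairs than the vertex budget allows.
    chairs_per_section = max_verts // verts_per_chair
    sections = []
    for key in sorted(cells):
        members = cells[key]
        for start in range(0, len(members), chairs_per_section):
            sections.append(members[start:start + chairs_per_section])
    return sections
```

In Unity, each resulting group would then be merged into one mesh, for example with Mesh.CombineMeshes, and the section size tuned by profiling.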
For something like this, you’ll simply have to do R&D to figure out what gives you the best perf. It might be that a different render pipeline is better. It might be that writing code to do more aggressive LODing than what LOD Groups do is useful. It might be that Unreal is just faster here and that’s what you need.
Note that you can combine the meshes in the scene, rather than in your DCC, if you’re up for some scripting. That’ll save you time assembling stuff in the DCC.
See my previous reply. People are just repeating stuff. You are probably going to need some mesh decimation, as you will be vertex-bound regardless. Those seats look pretty high poly, so you are going to want to start there. Give me some numbers from the profiler and/or stats window and I can most likely see exactly what you need to do. Let’s just get this done, right here right now.