I was recently looking into how Unity combines meshes when static batching is in use, and I noticed something that surprised me. It’s about how Unity decides which static meshes should be combined into a single mesh.
I guess that a few factors affect this process, but my expectation was that Unity would generally try to combine two meshes if they are placed close to each other in the scene (world space). Such an approach seems reasonable to me because meshes placed close to each other are very likely to be rendered one after another (the reason being that Unity uses strict front-to-back sorting for the Geometry render queue). If both meshes are in the same combined mesh, then they can be rendered in a single static batch.
The problem is that I don’t observe this behavior. When I enter play mode in the editor and check what the combined meshes look like, I usually see that each one consists of many static meshes spread across the entire scene with empty space between them. The final result is that the entire static geometry ends up in a few combined meshes that interlace each other over the entire scene.
I think that the number of static batches needed to render the scene could be much lower if Unity combined meshes in a more compact way.
There are several factors influencing the batching choices. The main one is that the materials need to be shared, but there are also less obvious things like maximum vertex counts per batch, … The batching documentation gives a bit more background on this: Unity - Manual: Draw call batching
When I enter play mode in the Unity editor I can see that all static geometry in my scene is split into 9 combined meshes. All combined meshes look similar to these:
Just to be clear: every single picture above shows a single combined mesh. All static geometry in the scene uses a single material.
As you can see, it seems that Unity does not take the world-space location of a static mesh into consideration when deciding which meshes to combine. The resulting combined meshes are not compact at all: every combined mesh contains static meshes spread across the entire scene.
My understanding is that this approach significantly decreases the potential performance benefits of static batching: Unity uses strict front-to-back rendering order, so it has to switch between different combined meshes (i.e. static batches) many times when rendering the scene. I believe that if static geometry were combined in a more compact manner, the number of such switches could be greatly reduced.
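To make the argument about switches concrete, here is a minimal toy model in Python, not Unity code. It assumes strict front-to-back ordering as described above and reduces the scene to objects on a line; the function name and the sample layout are made up for illustration.

```python
def count_static_batches(objects, camera_x=0.0):
    """Count draw batches for a strictly front-to-back draw order.

    objects: list of (x_position, combined_mesh_id) tuples.
    Consecutive objects that share a combined mesh form one batch;
    every change of combined mesh starts a new batch.
    """
    ordered = sorted(objects, key=lambda o: abs(o[0] - camera_x))
    batches = 0
    previous_mesh = None
    for _, mesh_id in ordered:
        if mesh_id != previous_mesh:
            batches += 1
            previous_mesh = mesh_id
    return batches


# 12 objects at x = 1..12, split into 3 combined meshes in two ways.
interleaved = [(i + 1, i % 3) for i in range(12)]   # meshes spread over the scene
compact = [(i + 1, i // 4) for i in range(12)]      # meshes grouped by position

print(count_static_batches(interleaved))  # 12
print(count_static_batches(compact))      # 3
```

In this simplified model the interleaved layout needs one batch per object, while the compact layout needs one batch per combined mesh. Real scenes are not one-dimensional, so the actual numbers will differ.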
I came to the same conclusion in all of my projects.
Static batching either reduced my fps or wasn’t worth the increased client and memory size.
Many objects aren’t culled any more, as the resulting static mesh and its bounding box just span the entire level.
AFAIR it’s just split when a certain vertex limit is reached.
Therefore many people write their own systems.
Some use grids, others hex fields (as in combination with an LOD system hex fields work nicer).
An asset that uses grids is in a current Humble Bundle (Mesh Combine Studio 2): https://www.humblebundle.com/software/unity-games-and-game-dev-assets-software
When combining meshes yourself you are also no longer limited to scene objects; you can combine your dynamically instantiated objects as well.
I didn’t try it out though, as I’m using my own baking scripts built on Unity’s Mesh.CombineMeshes API.
It’s a pity though, as it complicates the workflow more than it should.
Some of your observations might be correct, but it seems that you are missing the full picture of how static batching works in detail.
The effect of static batching is not the same as manually combining the given submeshes into a single mesh with a single mesh renderer attached. The main reason is that the static batcher does not forget about the division into submeshes after a combined mesh is created. After a combined mesh is uploaded to the GPU, Unity can still process submeshes independently, thanks to the ability to issue draw calls with different index buffers (one index buffer = one submesh).
In particular, it means that after combining static submeshes, Unity is still able to:
perform frustum and occlusion culling on a per-submesh basis
render submeshes, which were not culled, in a front-to-back order to minimize overdraw (assuming the geometry render queue is used).
That is why static batching is (in theory) a more powerful technique than any solution based on manual mesh combining, and as a result it has the potential to be much more efficient.
The problem is that the current Unity implementation of static batching does not seem very clever in some respects, which essentially wastes that potential.
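The per-submesh culling and sorting described in this post can be pictured with a small Python sketch. This is only a toy model and not Unity's internal data layout: the `SubMesh` fields and the `draw_ranges` function are hypothetical, and the visibility flag is given as input rather than computed.

```python
from dataclasses import dataclass


@dataclass
class SubMesh:
    name: str
    index_start: int   # first index of this submesh in the combined mesh
    index_count: int   # number of indices belonging to this submesh
    distance: float    # distance from the camera
    visible: bool      # result of frustum/occlusion culling (assumed given)


def draw_ranges(submeshes):
    """Return (index_start, index_count) ranges to draw, nearest first.

    Culled submeshes are skipped even though they live in the same
    combined mesh as the visible ones.
    """
    visible = [s for s in submeshes if s.visible]
    visible.sort(key=lambda s: s.distance)
    return [(s.index_start, s.index_count) for s in visible]


combined = [
    SubMesh("wall", 0, 600, distance=30.0, visible=True),
    SubMesh("crate", 600, 36, distance=5.0, visible=True),
    SubMesh("statue", 636, 900, distance=12.0, visible=False),
]

print(draw_ranges(combined))  # [(600, 36), (0, 600)]
```

A manually merged mesh with a single renderer has no such per-part records, so it is culled and sorted only as a whole.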
Recently came across the same problem. For some reason this thread is one of the only pieces of documentation on this limitation/misimplementation in one of Unity’s main performance optimizations.
Static batching has enormous potential, and for static objects GPU instancing is not even comparable.
Tried Mesh Combine Studio 2, and I have to say it can work in many cases, but my particular scenario would be better off with static batching, even in the bad state it is in.
Any solutions so far?
Found a solution → Do the following with an editor script:
Sort the objects to be batched into an array of new parents based on position.
Call StaticBatchingUtility.Combine on every single parent.
WAY better batching results, decreased my draw calls 30-fold.
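The post does not say how the objects were sorted into parents. One possible reading is a uniform grid on the XZ plane, sketched below in Python as a language-neutral illustration. The function name, the object names and `cell_size` are assumptions; in Unity this step would live in a C# editor script that reparents each group under a new GameObject and then calls StaticBatchingUtility.Combine on that parent.

```python
import math
from collections import defaultdict


def group_by_grid_cell(positions, cell_size):
    """Group object names by the XZ grid cell that contains them.

    positions: dict mapping object name -> (x, y, z) world position.
    cell_size: edge length of a grid cell (hypothetical tuning value).
    Each returned group would become the children of one new parent.
    """
    groups = defaultdict(list)
    for name, (x, _y, z) in positions.items():
        cell = (math.floor(x / cell_size), math.floor(z / cell_size))
        groups[cell].append(name)
    return dict(groups)


positions = {
    "rock_a": (1.0, 0.0, 2.0),
    "rock_b": (3.0, 0.0, 4.0),
    "tree_a": (25.0, 0.0, 1.0),
    "tree_b": (-1.0, 0.0, -1.0),
}

groups = group_by_grid_cell(positions, cell_size=10.0)
print(groups[(0, 0)])  # ['rock_a', 'rock_b']
```

Smaller cells give more compact combined meshes but more of them, so the cell size is something to tune per scene.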