procedural mesh incredibly slow on iOS

I’m making use of procedural mesh generation in my latest iOS game and to my shock I discovered that these two lines:

_Mesh.vertices = _Vertices;
_Mesh.colors = _Colours;

are taking 11ms each to execute. Given that the frame budget at 60 frames a second is roughly 16ms, you can imagine how slow things are now moving. It’s not as if there are a lot of vertices or colours being copied over, fewer than 30k per frame.

I’ve tried marking the mesh as dynamic but that doesn’t improve the situation.

11ms is a long time; the whole of the rest of the game executes in less time than those two lines.

Anybody have any idea?

After a conversation on Twitter it’s been suggested that, because Vector3 is a struct rather than a class, assigning my vertices array to the mesh’s vertices array is going to involve a lot of copies. This sounds plausible but I still wouldn’t expect that to eat 20ms per frame.

It’s nothing to do with Vector3 or structs/classes (that’s not how arrays of structs work anyway…arrays are always by reference regardless). When you assign vertices to the mesh, they are uploaded to the graphics card, so it’s a copy rather than updating a pointer. But at least for the colors you should use Mesh.colors32 rather than Mesh.colors.

–Eric
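
For anyone following along, the colors32 suggestion above might look roughly like this. It is a minimal sketch only: the component name `BillboardBatch` and the quad count are made up for illustration, the field names follow the first post, and UVs and the per-frame fill logic are left out.

```csharp
using UnityEngine;

// Hypothetical component; the name and QuadCount are illustrative only.
[RequireComponent(typeof(MeshFilter))]
public class BillboardBatch : MonoBehaviour
{
    const int QuadCount = 1024;          // assumed size, not from the thread

    Mesh _Mesh;
    Vector3[] _Vertices;
    Color32[] _Colours;                  // four bytes per vertex instead of four floats

    void Awake()
    {
        _Vertices = new Vector3[QuadCount * 4];
        _Colours = new Color32[QuadCount * 4];

        // Two triangles per quad; the indices never change, so build them once.
        int[] triangles = new int[QuadCount * 6];
        for (int q = 0; q < QuadCount; q++)
        {
            int v = q * 4;
            int t = q * 6;
            triangles[t] = v;     triangles[t + 1] = v + 1; triangles[t + 2] = v + 2;
            triangles[t + 3] = v; triangles[t + 4] = v + 2; triangles[t + 5] = v + 3;
        }

        _Mesh = new Mesh();
        _Mesh.MarkDynamic();             // hint that the data changes often
        _Mesh.vertices = _Vertices;      // vertices must exist before triangles
        _Mesh.triangles = triangles;
        GetComponent<MeshFilter>().sharedMesh = _Mesh;
    }

    void LateUpdate()
    {
        // ... fill _Vertices and _Colours for this frame here ...
        _Mesh.vertices = _Vertices;
        _Mesh.colors32 = _Colours;       // byte colours rather than Mesh.colors
    }
}
```

If the quads can move outside the mesh’s original bounds, a call to `RecalculateBounds()` after the vertex assignment would probably be needed as well, at some extra cost.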

I could understand a slight delay on the CPU->GPU transfer on a desktop machine because it has to traverse the bus. I used to work for 3DLabs designing 3D hardware and worked closely with the DX and OpenGL driver teams, and we worked damn hard to get that kind of transfer down in the drivers. But iOS is a shared memory architecture, so there’s no graphics card in the same sense and no transfer. There’s probably some internal housekeeping that Unity has to do, but 20ms is still a lifetime in frame terms.

Interesting what you say about colors32 though. I shall amend the code to see if there is any difference.

The problem is that what you do there will update the VBO every frame if this is all the code.

If you have dozens of meshes or more, or high-poly meshes, that’s going to drag your game down easily, as draw calls are CPU limited and so are VBO updates. It does not matter that the devices have got faster: compared to the CPU power of desktop processors they are still absolute jokes, years away from even competing with first-generation Core i processors in such a use case. That’s why even a crap computer can push 4000+ draw calls, while a tablet is lucky to get beyond 100 before it is capped at 30fps.

For that reason you should consider which meshes really need to be dynamic and which do not. Meshes that can be treated as static will not re-upload their VBO every frame and will be significantly more performant.

Just to be sure: don’t add mesh colliders to procedural meshes, in case you are doing that or considering it. The update costs will be massive, as the whole k-tree for the mesh bounding box area has to be recalculated on the fly by PhysX.
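
The static-versus-dynamic advice above could be sketched as a simple dirty flag, so the vertex array is only handed to the mesh on frames where it actually changed. This is an illustration under assumptions, not anything Unity prescribes: the class name `UploadWhenDirty` and its `SetVertices` method are hypothetical, and triangles and colours are omitted.

```csharp
using UnityEngine;

// Hypothetical helper; class and method names are illustrative only.
[RequireComponent(typeof(MeshFilter))]
public class UploadWhenDirty : MonoBehaviour
{
    Mesh _Mesh;
    Vector3[] _Vertices = new Vector3[0];
    bool _Dirty;

    void Awake()
    {
        _Mesh = new Mesh();
        GetComponent<MeshFilter>().sharedMesh = _Mesh;
    }

    // Call this from game code whenever the geometry actually changes.
    public void SetVertices(Vector3[] vertices)
    {
        _Vertices = vertices;
        _Dirty = true;
    }

    void LateUpdate()
    {
        if (!_Dirty) return;             // unchanged this frame: skip the assignment
        _Mesh.vertices = _Vertices;      // the costly hand-over happens only here
        _Dirty = false;
    }
}
```

This only helps for meshes that sit unchanged for some frames; a mesh that genuinely changes every frame still pays the assignment cost each time.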

Changing from Color to Color32 made a vast improvement: assigning colours is now down to just over 1ms, which may just give me the time I needed to claw back.
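
As a rough back-of-envelope check on why the colour path might have shrunk: `Color` holds four floats and `Color32` holds four bytes, so the array being handed over is a quarter of the size. Whether that alone explains the timing is an assumption, since nothing in this thread profiles Unity’s internals. The class below is a hypothetical stand-alone calculator, not Unity API.

```csharp
using System;

// Back-of-envelope sizes only. The constants mirror the field layout of
// UnityEngine.Color (four floats) and UnityEngine.Color32 (four bytes).
public static class ColourPayload
{
    public const int BytesPerColor = 4 * sizeof(float);   // r, g, b, a as float
    public const int BytesPerColor32 = 4 * sizeof(byte);  // r, g, b, a as byte

    // Total bytes in an array of `count` colours of the given element size.
    public static int TotalBytes(int count, int bytesPerElement)
    {
        return count * bytesPerElement;
    }
}
```

For 30,000 colours, `ColourPayload.TotalBytes(30000, ColourPayload.BytesPerColor)` comes to 480,000 bytes, against 120,000 bytes with `BytesPerColor32`.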

Just one mesh that dynamically updates per frame. Draw calls are an issue, which is why I went with the procedural mesh method. I have a requirement for axis-aligned billboards in the game, and rather than have n gameObjects generating n draw calls it made sense to hand batch them, or so it seemed. I’ve done exactly this kind of code in native Objective-C and OpenGL without any issues or performance hits. Before anyone asks about letting Unity dynamically batch them, that system seems to work or not based on the phase of the moon…

The profiler reports that the C# side of setting everything up per frame is less than 1ms, so that’s all fine; it is literally down to Unity’s internal housekeeping for updating dynamic meshes. Given that Unity iOS is supposed to support procedural mesh work, I would have expected this particular pipeline not to take that much time. Again, 20ms is a phenomenal amount of time to update one mesh.

Changing to Color32 has meant that the vertices update is now 10ms, which is still too much imho.

My current fallback strategy is to go back to the many draw call strategy and hope the Unity dynamic batcher realises that it should be batching these calls.