GPU usage mismatch

I’m writing a scientific viewer framework, kinda, for the HTC Vive. I have a GeForce GTX 1080. Suppose I have a 25x25x25 array of transparent cubes. At those dimensions, rendering is a little jerky. I open the Unity Profiler, and both my CPU and GPU show as being maxed out.


That’s fine and all. However, nvidia-smi, NVIDIA’s GPU monitoring program, shows the GPU usage staying at 7–13%. (The power and fan numbers support this.) I’m not sure there’s even any correlation with whether Unity is running.

What I’d suspect is that Unity’s not using my GeForce card, and is instead using some built-in graphics or something. However, the log says otherwise:

GfxDevice: creating device client; threaded=1
Direct3D:
Version: Direct3D 11.0 [level 11.0]
Renderer: NVIDIA GeForce GTX 1080 (ID=0x1b80)
Vendor: NVIDIA
VRAM: 8144 MB

So if the logs say Unity’s using my GeForce GPU, and Unity’s profiler says it’s maxing out the GPU, but NVIDIA’s GPU monitor says it isn’t, what’s going on? And, hopefully, how can I get it to render faster? (And if anybody knows a fast way to render a block of additively transparent cubes, that’d be cool.)

Well, the GPU usage percentage doesn’t have to have any relation to how long certain processes take. There could be a lot of overhead involved that slows down the overall processing. The GPU usage just refers to how much work the GPU cores are currently doing.

A good counterexample would be an HDD. If it has a max write speed of, say, 300 MB/s, we measure the write-speed usage against that value. If, however, you write two or more large files “in parallel”, the HDD head has to reposition after each block written, since the different files are written alternately. The actual write speed you can reach is now much lower than the maximum possible. So even though you only “use”, say, 10% of its capabilities, it simply doesn’t go faster due to some sort of bottleneck.

In the case of the GPU, that could have many reasons. Frequent flushing / reprogramming of the pipeline can be a problem.

May I ask how those cubes are actually rendered? Hopefully not as 15k+ GameObjects ^^. However, since your “batches” line is that far up, I might have hit the nail on the head ^^.

edit
I just had a quick look at your project (didn’t download it, just browsed some files). As I guessed, you created a batching and draw-call “nightmare”, and your profiler results show exactly that. First of all, having that many GameObjects is already kind of bad for performance. But in addition, you modify every material individually, which instantiates a separate material for each of your “voxels”. If Unity rendered 25³ (== 15625) GameObjects separately, it would be difficult to express your framerate in whole numbers. Luckily, Unity usually does its best to batch things where possible. However, setting different colors in the material breaks batching.
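For illustration only (I haven’t seen your exact code, and the class name here is made up), the difference usually comes down to which property you use when setting the color:

```csharp
using UnityEngine;

// Hypothetical example: the property you use to set the color decides
// whether Unity silently clones the material.
public class VoxelColorExample : MonoBehaviour
{
    void Start()
    {
        Renderer r = GetComponent<Renderer>();

        // .material creates a per-renderer COPY of the material
        // ("MyMat (Instance)") on first access -- with 15625 voxels
        // that's 15625 materials, and every one of them breaks batching:
        r.material.color = Color.red;

        // .sharedMaterial writes to the single shared asset instead --
        // no copy, batching survives, but now ALL voxels show this color,
        // which is why per-voxel color belongs in vertex colors instead:
        r.sharedMaterial.color = Color.red;
    }
}
```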

Things like voxel systems are usually implemented by creating a large Mesh procedurally and just updating that mesh (in your case, just the vertex colors). As far as I have seen, you’re not doing very complex calculations. This might even be easier to implement in a shader, so you wouldn’t have to fiddle with each “voxel” manually on the CPU.
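A minimal sketch of what I mean (class and method names are my own invention; vertex/triangle generation is assumed to happen elsewhere, and the material on the MeshRenderer must be a vertex-color-aware shader, e.g. an additive particle shader):

```csharp
using UnityEngine;

// Sketch: one combined mesh per chunk of voxels; recoloring a voxel
// rewrites only the vertex-color array instead of touching materials.
[RequireComponent(typeof(MeshFilter))]
public class VoxelChunk : MonoBehaviour
{
    Mesh mesh;
    Color32[] colors;

    // vertices/triangles for the whole chunk, generated elsewhere
    // (24 vertices per cube: 4 corners per face * 6 faces).
    public void Build(Vector3[] vertices, int[] triangles)
    {
        mesh = new Mesh();
        mesh.vertices = vertices;
        mesh.triangles = triangles;
        colors = new Color32[vertices.Length];
        mesh.colors32 = colors;
        mesh.RecalculateBounds();
        GetComponent<MeshFilter>().mesh = mesh;
    }

    // Change one voxel's color: all 24 of its vertices get the same value,
    // then only the color channel is re-uploaded to the GPU.
    public void SetVoxelColor(int voxelIndex, Color32 c)
    {
        for (int v = 0; v < 24; v++)
            colors[voxelIndex * 24 + v] = c;
        mesh.colors32 = colors;
    }
}
```

The whole chunk then renders as a single draw call with a single shared material, no matter how many voxels change color per frame.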

As I said, I haven’t downloaded your project, so I don’t even know what your voxel prefab looks like. Is it a cube or something else? A single Mesh in Unity can only have about 65k vertices. If you have 25³ cubes, that would be 25*25*25*24 == 375000 vertices. So you would need to split your “big cube” into at least 6 Mesh objects to cope with that amount of vertices.
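Spelled out as a quick sanity check (using 65535, the 16-bit index limit, as the “about 65k” cap):

```csharp
// Back-of-the-envelope check for the chunk split:
int cubes = 25 * 25 * 25;                 // == 15625 voxels
int vertsPerCube = 4 * 6;                 // 4 corners per face, 6 faces == 24
int totalVerts = cubes * vertsPerCube;    // == 375000
int meshesNeeded = (totalVerts + 65535 - 1) / 65535;  // == 6 (ceiling division)
```

So splitting the 25-voxel cube into six slabs of 4–5 layers each would keep every chunk under the vertex limit.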

Depending on your target hardware, if you have access to DX11 you could use a geometry shader to create the voxel cubes inside the shader. Each voxel would then only require a single vertex. Though there are many “unknowns” here, so it’s difficult to suggest anything specific. What is clear is that you have to:

  • avoid using that many separate GameObjects.
  • use a single material and work with vertex attributes (e.g. vertex colors) of the mesh(es).