SpeedTree performance 5.3 -> 5.4 -> 5.5

Hello,
I know it’s a beta, but the problem has been there since 5.4.
Case 818706
The case was created on 29.7.2016 and is still in Open status, even though Unity replied to me that it was reproduced successfully.

After upgrading to 5.4, my FPS was approximately 30% worse than on 5.3.6p1. Now it looks like performance on 5.5 is even worse.

I have prepared a simple repro project. It’s just a blank 1.5 x 1.5 km terrain with 10,000 SpeedTrees of one kind (deferred, linear).

Here are my results:
5.3
dx11 = 145 FPS

5.4
dx11 = 103 FPS
dx11 + jobs = 61 FPS
dx12 = 82 FPS
dx12 + jobs = 110 FPS

5.5
dx11 = 43 FPS
dx12 + jobs = 63 FPS
dx12 + jobs + baking of probes on terrain disabled = 137 FPS (still worse result than on 5.3 DX11 with light probes enabled)

Are you aware of this problem?
Best regards
Peter

Well, we’ve had it listed as a known issue in the release notes (Unity Editor Beta Releases), last listed for b10.

We believe we have it addressed, not in b11 but in the version after (the fix only landed in the last couple of days).

I had read about this issue but didn’t notice that it was my case number, since it exists in 5.4 to some degree too.
Thanks

Updated results on 5.5.0f1
5.3
dx11 = 145 FPS

5.4
dx11 = 103 FPS
dx11 + jobs = 61 FPS
dx12 = 82 FPS
dx12 + jobs = 110 FPS

5.5.0f1
dx11 = 85 FPS
dx12 + jobs = 105 FPS
dx12 + jobs + baking of probes on terrain disabled = 140 FPS (still worse result than on 5.3 DX11 with light probes enabled)

So we are basically back to 5.4 performance. 5.3 is still about 38% faster in FPS (145 vs. 105), and that is with light probes enabled.
Is this the final optimization?
Thanks
Peter
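
For readers comparing the numbers, here is a minimal sketch of the arithmetic behind the percentages quoted in this thread. The FPS values are copied from the results above; the function names are illustrative only. It also shows why "X% worse" and "Y% better" differ for the same pair of numbers: one is measured against the faster result, the other against the slower one.

```python
# FPS figures copied from the posted results; helper names are illustrative.
results = {
    "5.3 dx11": 145,
    "5.4 dx11": 103,
    "5.4 dx12 + jobs": 110,
    "5.5.0f1 dx11": 85,
    "5.5.0f1 dx12 + jobs": 105,
}


def frame_time_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


def percent_faster(baseline_fps, other_fps):
    """How much higher baseline_fps is, relative to other_fps."""
    return (baseline_fps / other_fps - 1.0) * 100.0


def percent_slower(baseline_fps, other_fps):
    """How much lower other_fps is, relative to baseline_fps."""
    return (1.0 - other_fps / baseline_fps) * 100.0


baseline = results["5.3 dx11"]
for name, fps in results.items():
    print(
        f"{name}: {fps} FPS, {frame_time_ms(fps):.2f} ms/frame, "
        f"{percent_slower(baseline, fps):.0f}% slower than 5.3 dx11, "
        f"5.3 dx11 is {percent_faster(baseline, fps):.0f}% faster"
    )
```

Comparing frame times in milliseconds is usually easier to reason about than FPS, because the same FPS difference means a different cost at 145 FPS than at 60 FPS.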

UNITY 5.4.3

Test Scene

  • DX12 SpeedTree + WindZone + LODs + Dynamic batching (No reflection probes)
  • CPU: 4.2 ms
  • GPU: 1.7 ms

In-game context:

  • DX11
  • CPU: 13.8 ms
  • GPU: 5.8 ms

UNITY 5.5.0f1

Test Scene

  • DX12 SpeedTree + WindZone + LODs + Instancing + Dynamic batching (No reflection probes)
  • CPU: 3.9 ms
  • GPU: 2.2 ms

In-game context:

  • DX11
  • CPU: 13.9 ms
  • GPU: 9.2 ms

CONCLUSION

Saving little on CPU, losing too much on GPU
For us, anything that can optimize the CPU is good, but this is unfortunately not good enough for us to make the switch (saving 0.1 ms on CPU to lose ~3 ms on the GPU is not a fair trade, to be honest). Maybe in some contexts this tradeoff can be more balanced, but in all of ours 5.4 is faster.
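
As a rough illustration, the per-version deltas can be computed directly from the timings listed above. This is only a sketch: the numbers are copied from this post, and the dictionary layout and function name are made up for the example.

```python
# (CPU ms, GPU ms) pairs copied from the measurements listed in this post.
timings = {
    "5.4.3": {"test scene": (4.2, 1.7), "in-game": (13.8, 5.8)},
    "5.5.0f1": {"test scene": (3.9, 2.2), "in-game": (13.9, 9.2)},
}


def delta_ms(old, new):
    """Return (CPU delta, GPU delta) in ms; negative means the new version is faster."""
    return tuple(round(n - o, 1) for o, n in zip(old, new))


for context in ("test scene", "in-game"):
    cpu, gpu = delta_ms(timings["5.4.3"][context], timings["5.5.0f1"][context])
    print(f"{context}: CPU {cpu:+.1f} ms, GPU {gpu:+.1f} ms going from 5.4.3 to 5.5.0f1")
```

With the listed figures, the GPU cost grows in both contexts, while the CPU time changes by a few tenths of a millisecond at most.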

More control would be necessary
Right now, unless there is something really big that I’m missing, there is no way to limit GPU instancing on the trees or to turn it on and off. This makes it impossible to find the balance that suits our needs.

Comes with a bunch of glitches
We have neither the time nor the resources to investigate this properly, but we found that the LODs of our trees cause many glitches as we move through the forests of our game. This does not happen on 5.4, which suggests there are issues with the new instanced shader.

Hi Alexandre,
Your test scene looks nice. Would you mind sharing it with us so that we can later run some performance measurements against it?

Right now in 5.5 you can turn off SpeedTree instancing by copying the SpeedTree shader into your project and deleting the multi_compile_instancing lines. We are developing a new workflow for instancing for the next release cycle to make it more accessible.
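
If you would rather not edit the copied shader by hand, the deletion step could be scripted along these lines. This is a minimal sketch, not an official tool: it assumes the shader has already been copied into the project, that the directive appears on its own line (for example as `#pragma multi_compile_instancing`), and the file path in the usage comment is hypothetical.

```python
from pathlib import Path


def strip_instancing(shader_source):
    """Return the shader text without any line that mentions multi_compile_instancing."""
    kept = [
        line
        for line in shader_source.splitlines(keepends=True)
        if "multi_compile_instancing" not in line
    ]
    return "".join(kept)


def patch_shader_file(path):
    """Rewrite a copied shader file in place with the instancing lines removed."""
    shader_path = Path(path)
    original = shader_path.read_text(encoding="utf-8")
    shader_path.write_text(strip_instancing(original), encoding="utf-8")


# Usage (hypothetical path to the project-local copy of the shader):
# patch_shader_file("Assets/Shaders/SpeedTreeCopy.shader")
```

Only run it on the copy inside your project, and keep the file under version control so the change is easy to revert.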

Please help us identify the glitch problem by submitting a repro project. We don’t see glitches in our own tests.

We are continuously working on the performance of instancing, and enabling more features (like supporting light probes). I’m sure things will get better and better.

Sure! I’ve sent you a link via private message, so that we do not share the trees we worked so hard on before our game launches :roll_eyes: