Huge performance hit with HDRP vs Built-in pipeline

Hello,

I’m working on the new version of my tennis game using Unity 2019.1.14f1 + HDRP 5.16.1 .

And I just compared the performances with my previous tennis game, in 2 nearly identical views, using Unity 5.6.7.

On the top right of the screenshots, I’m checking the GPU usage is ~100% ; on the bottom right, you can see the CPU usage.

My rig : Windows 10 + Intel i7 @ 4.6 GHz + 32GB + NVidia GTX 1060 6GB at 1920x1200, with the latest NVidia drivers.

The only post-processing effects in both games for the test were exposure & bloom.

In build, simple scene, no shadow ; 165 vs 590 fps, HDRP is ~250% slower :


In build, complex scene, even the crowd has shadow ; 125 vs 315 fps, HDRP is ~150% slower :


Note : each crowd mesh includes dozens of people, not only one.

The legacy game uses the deferred path, because the forward path was much slower with the stadium shadows.
The new game uses the forward path, because the doc states it’s faster ; although I tried the deferred path and the performance seemed similar.

Both games use realtime GI + linear color space.

Both games use reflection probes. The new one uses 5 instead of 1, but after checking, it doesn’t noticeably change the performance.

Here are the stats from the editor for each screenshot, top left is HDRP Simple, bottom right is Legacy Complex :

The new game uses materials for the stadium instead of composed textures, and thus there are more draw calls, although nothing crazy (any good PC rig can handle thousands of these).

Note : the stats in Unity 2019.1 seem bugged, as the tris & verts counters count only the skinned meshes ; disabling anything else doesn’t change them. Even the shadow casters counter wrongly says 0 for the complex scene, although the shadows were on.

So would anyone have any idea what’s going on ? Are there obvious mistakes that could create such a performance hit ?

Thanks in advance for any tip ! :slight_smile:

AFAIK, HDRP is not done so there’s that, and also it’s supposed to be really good and fast when you take advantage of its strong points, like using a lot of lights.

It looks to me like your game may be better served by LWRP or Built-In.

Note : I had mistakenly tested with my 2019.1.12f1 build.
With the 2019.1.14f1 build, the complex scene fps are up to 125 from 100. It’s a bit better, but it should be at least at 200 fps to be reasonable considering the “complex” scene isn’t that heavy.

I guess now I have to test with 2019.2 … :stuck_out_tongue:

However, between the 2 Unity versions, I also had turned off the Contact Shadows support (although I didn’t use them), so maybe the fps boost came from there.

@AcidArrow ,
I’m already a fan of the HDRP in general ; it’s hard to set up & dig into at 1st, but overall, it’s a major step up over the legacy built-in path. And the LWRP has some serious quality limitations compared to the HDRP (I’m in the process of changing the legacy assets seen in the screenshots above :wink: ).

I found some HDRP optimization advice by a Unity dev, I’m going to dig into it and see if anything has a meaningful impact.

The fps boost wasn’t from the Unity version change, nor the Contact Shadows support, but from switching to the Deferred rendering. So it’s actually faster than the Forward rendering… :sunglasses:

Afterwards I turned off everything unneeded in the HDRP asset & the Frame Settings, and it didn’t seem to change anything at all in terms of performance.

I turned off the SRP Batcher and got back my tris & verts stats. The complex scene gives 5M tris + 6M verts, vs 1.6M tris + 10M verts with the Built-in version. In the simple scene, it’s 64k tris + 70k verts vs 31k tris + 55k verts, so it’s much closer. I’m not sure what to think about all these numbers. It could account for a 50% slower rendering, I guess.

I’m going to try to display the old assets in the HDRP game, it might ease the testing.

Ok, I put back the old stadium & most of the props. It now gives for the complex scene 1.9M tris + 8M verts vs 1.6M tris + 10M verts with the Built-in version, and 170 fps vs 315 ; still 85% slower for the HDRP.

I’d settle for a 20% loss, so hopefully the future HDRP versions will be a bit more optimized, because right now I don’t see what else I could do to speed things up.

These benchmarks are kinda pointless, measure the gpu cost in ms if you want to get some real figures (don’t use FPS as it doesn’t really tell that much). I’d also suggest testing on a weaker GPU to get some use cases where the perf difference actually matters: 170 fps is probably fine for 100% of your player base…
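
To illustrate the fps vs ms point, here is a minimal sketch in Python (not Unity code) that converts the fps figures quoted earlier in this thread into average frame times. It assumes the builds were GPU-bound, as the ~100% GPU usage reported above suggests, so the whole frame time is treated as GPU cost ; the name `frame_time_ms` is just for illustration.

```python
def frame_time_ms(fps: float) -> float:
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


# (label, HDRP fps, Built-in fps), figures quoted earlier in this thread
results = [
    ("simple scene", 165, 590),
    ("complex scene", 125, 315),
    ("complex scene, old assets", 170, 315),
]

for label, hdrp_fps, builtin_fps in results:
    hdrp_ms = frame_time_ms(hdrp_fps)
    builtin_ms = frame_time_ms(builtin_fps)
    # The absolute gap in ms is what a profiler would show as extra cost.
    print(f"{label}: HDRP {hdrp_ms:.2f} ms vs Built-in {builtin_ms:.2f} ms "
          f"(+{hdrp_ms - builtin_ms:.2f} ms per frame)")
```

With these figures, the gap works out to roughly 4.4 ms in the simple scene and 4.8 ms in the complex one, which looks more like a fairly constant extra cost than the "250% vs 150%" comparison suggests.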

As additional note: do you really need realtime GI for this? :slight_smile:

@rz_0lento ,
fps is what actually matters for the end user, it’s the only metric that counts in the end. Plus it’s super easy to test, and here I need to check ballpark figures, not small gains. :slight_smile:

A few users expect to play on their 200 Hz gaming monitors, a lot more expect to play on their 120/144 Hz gaming monitors. Others will be expecting to play in 4K. And others will be playing with a less powerful GPU, as you have pointed out.

Plus, it’s not 170 fps with my current assets, but with the old ones ; moreover I expect the final fps to drop more & more as I add stuff along the way, so now is a good time to get a general idea about where I stand.

Lastly a GTX 1060 is pretty mid-range now, so it’s a good reference.

So any result under 200 fps in my little test is problematic. :frowning:

BTW, what would you use for measuring the GPU cost in ms ?

For end users, yes, but you are now posting on a developer forum. Using FPS for perf measurements is bad because it is a sum of many things, you don’t really see where you actually sink the perf at all. Use Unity’s profilers to see what is expensive and what is not and you don’t have to guess or use some super rough ballpark methods like measuring FPS.

Those few with 240Hz 1080p monitors most likely got a decent GPU already :slight_smile:

All this being said, with a project as light as yours, you can’t beat the built-in renderer in perf with HDRP, or even match it. HDRP has more initial overhead.

My initial goal was to see how the performances compared between the 2 engines. Thus the Fps was the best tool for that.

Considering the huge performance hit, my 1st thought was that I made some newbie mistakes (I had read the HDRP docs, but they are really sparse).

Diving into the Unity Profiler gives very little info, except “waiting commands” without further detail for nearly 75% of the rendering thread ; and anyway, the rendering thread is CPU time, not GPU time, so it’s nearly useless. So that won’t tell me whether I made a mistake or not, and what could be optimized.

So if the HDRP is nearly 100% slower than the Built-in in a normal case, I think it’s really a problem.

In another topic about these performance issues, I read a Unity dev bragging that the HDRP would shine when using 200 lights on screen. Ok, great, but how many games actually need to show 200 lights most of the time ?

A base scene like mine shouldn’t be that much slower. A 20% hit would be understandable due to the extra stuff, but nearly 100% isn’t.

So I still hope I missed something important, or that they optimize the hell out of it for the official release… :slight_smile:

No it’s not. HDRP is supposed to scale really well. Built-in doesn’t scale that well. It’s supposed to be used when you need a ton of lights and other high end features and it’s intended for high end platforms. It’s supposed to have a lot of overhead, but then scale really well.

If that’s not suited for you, use LWRP or Built-In.

@AcidArrow ,
I already answered you above…

You did, but if you are a fan of HDRP, you need to accept that it has higher overhead than the other options, that’s how it was designed to be.

I’m more after understanding what’s going on, so that afterwards I can make a more educated choice, whether it’s tuning some of my assets, working on the rendering side, or like you said, just accepting it’s slower. Right now, I’m in the dark, and it would be stupid not to look more into it and not fix the things I could fix… :slight_smile:

I’ve read all the blog posts about HDRP, all the HDRP doc, and a few posts on the forum by Unity devs, and right now, I feel it’s really hard to know what the deal is with HDRP (just check that other guy’s post to which I answered as well, he’s in the same boat as me) ; I guess there’s a higher overhead, but it should be explained & documented in detail, so we could understand what we can do to limit it and to take advantage of it (I hope it’s not only to render 200 lights, coz that would seriously limit the utility of the HDRP).

Right now HDRP is the obvious choice for rendering quality, but the performance cost shouldn’t be that high.

I’ll quote 2 things from this blog post : https://blogs.unity3d.com/2018/03/16/the-high-definition-render-pipeline-focused-on-visual-quality/

Both of these things made me think that HDRP was not the slow hog I got with my test scene. (except if we consider that a GTX 1060 is too old to harvest the HDRP awesomeness)

In that post, they also never talk about overhead, nor lower performance. They only state you need recent hardware.

And the 1st quote could mean I didn’t configure something correctly, although I checked everything I could think of, but as I don’t master all this, I may have had some oversight, thus my reaching for help in this forum… :slight_smile:

I just did another test, on the complex scene, but without the shadows on the crowd, and using the legacy stadium in the new game, in exclusive fullscreen mode (before I had tested in windowed mode ; exclusive mode is ~5% faster with Unity 2019, and ~10% faster with 5.6).

I wanted to see the fps loss when using a bigger resolution, so I tested 1920x1080 vs 2560x1440 : this is ~78% more pixels ; here are the results :

  • Built-in : 373 vs 235 fps ; 59% slower
  • HDRP : 203 vs 134 fps : 51% slower

This is quite similar. So if there’s an overhead, it’s mostly in the pixel shaders, which means there’s a real performance issue right there, as quality wise the shaders are pretty similar looking when using 1 directional light, GI, normal & mask maps.
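
The same comparison can be redone in frame time rather than fps. Here is a rough sketch in Python, using only the figures above ; it assumes the frame is GPU-bound at both resolutions, and the helper names are made up for illustration.

```python
def frame_time_ms(fps: float) -> float:
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def relative_increase(new: float, old: float) -> float:
    """How much bigger `new` is than `old`, as a fraction (0.5 == +50%)."""
    return new / old - 1.0


pixels_1080p = 1920 * 1080
pixels_1440p = 2560 * 1440
extra_pixels = relative_increase(pixels_1440p, pixels_1080p)
print(f"extra pixels at 1440p: {extra_pixels:.1%}")

# pipeline -> (fps at 1080p, fps at 1440p), from the list above
results = {"Built-in": (373, 235), "HDRP": (203, 134)}

for pipeline, (fps_1080p, fps_1440p) in results.items():
    ms_1080p = frame_time_ms(fps_1080p)
    ms_1440p = frame_time_ms(fps_1440p)
    print(f"{pipeline}: {ms_1080p:.2f} ms -> {ms_1440p:.2f} ms "
          f"(+{ms_1440p - ms_1080p:.2f} ms, "
          f"+{relative_increase(ms_1440p, ms_1080p):.0%} frame time)")
```

In relative terms both pipelines slow down by a similar amount, but in absolute terms the extra pixels cost about +1.6 ms with Built-in and about +2.5 ms with HDRP.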

You could try 6.9 HDRP, which seems a bit newer.

New test with Unity 2019.2.2f1 + HDRP 6.9.1, in the complex scene, with the new stadium, no SSAO nor FSAA, 1080p vs 1440p :

  • 2019.1 : 156 vs 103
  • 2019.2 : 154 vs 107

So the new version is a tiny bit slower in 1080p and a bit faster in 1440p. I’m not sure what to think about that.

Bonus note : the SSAO in HDRP 6.9.1 is completely broken (I opened a new topic about that issue :stuck_out_tongue: )

Double bonus : the fps in the 2019.2 editor is super low (around 60 fps instead of ~85) ; if it gets any lower, I won’t be able to easily test my gameplay in the editor anymore… :face_with_spiral_eyes:

As one of the particularities of the SRP is that it’s scriptable in C#, I tried to build using IL2CPP, thinking it may help it run faster.

  • still 2019.2, 1080p vs 1440p : 147 vs 108

So it got a bit slower in 1080p and a tiny mini bit faster in 1440p. Once again, I don’t know what I should think about that… :smile:

So I downgraded my game to use Unity 2019.1.14f1 Built-in pipeline.

Under the same conditions as in the previous 3 messages, I got :

  • 1080p vs 1440p : 260 vs 174
  • compared to the best HDRP, it gives : +~70% vs +~60%

I guess the lower boost on 1440p means my GPU is closer to its limits.

Side note :
Test with Light Probes, 1080p vs 1440p : 243 vs 171
Test with Light Probes + SSAO, 1080p vs 1440p : 218 vs 145

PPSv2 SSAO @ takes about 0.5ms, which is on par with the one from HDRP 5.16.1 .
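
As a sanity check, the SSAO cost can also be estimated from the fps figures in the side note above. This is only a sketch in Python : it assumes the fps drop is entirely due to SSAO and that the frame is GPU-bound, and `added_cost_ms` is a hypothetical helper name.

```python
def frame_time_ms(fps: float) -> float:
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def added_cost_ms(fps_without: float, fps_with: float) -> float:
    """Extra frame time, in ms, after enabling an effect."""
    return frame_time_ms(fps_with) - frame_time_ms(fps_without)


# Light Probes only vs Light Probes + SSAO, from the side note above
ssao_1080p = added_cost_ms(243, 218)
ssao_1440p = added_cost_ms(171, 145)

print(f"SSAO at 1080p: +{ssao_1080p:.2f} ms")
print(f"SSAO at 1440p: +{ssao_1440p:.2f} ms")
```

At 1080p this gives a bit under 0.5 ms, in line with the 0.5ms figure ; at 1440p it comes out at roughly 1 ms.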

So I’ll stick to the Built-in pipeline, at least till the HDRP is stabilized & optimized, and works on Intel Iris. HDRP looks better to me, though… :frowning: (but apparently, not to my users nor the Unity users :stuck_out_tongue: )

EDIT:
the poly count is slightly lower in the latest version of my stadium, but it gives less than a 1% boost. (I just moved the camera a bit in the HDRP test to get more or less the equivalent of the new stadium :smile: )

Well, I wouldn’t create a game on HDRP to target such devices ;p

What you could try is adding volumetric fog, GPU particles, subsurface scattering and deferred decals to the built-in and then compare. Or replicate this setup and run it at 30 FPS, 1080p on PS4. To this I’d say: Good luck :smile:

@alexandre-fiset ,
you may want to read above : “Huge performance hit with HDRP vs Built-in pipeline”, or here : “Can you guess which one is HDRP rendering vs Built-in rendering ? (and vote for the most beautiful)”.

Note : I’m not targeting IGPs, I just want to support them as they are a non-negligible part of my income.