(Case 946966) GPU Instancing significantly slower than no batching at all (UWP/Xbox One)

On Xbox One (UWP, retail console in developer mode), the provided test scene renders significantly slower with “GPU Instancing” enabled than with no draw-call batching at all, even though Unity does batch many draw calls when GPU Instancing is used.

Please see the provided screenshots.

[Screenshots: overview.png; Profiler - Static Batching; Profiler - No Draw-call batching; Profiler - GPU Instancing]

Reproduce

Unity
0. Open the user-attached project
1. Execute Mainmenu > BugReport > Disable GPU Instancing
2. Open Player Settings and turn off Static and Dynamic Batching
3. Open File > Build Settings
4. Select the “Universal Windows Platform” tab
5. Turn on “Development Build” (to be able to attach the Unity Profiler)
6. Build the UWP Player with the settings Target=Any Device, Build Type=D3D
7. Open the generated .sln file in Visual Studio 2017

Visual Studio
8. Switch “Solution Configuration” to “Release”
9. Switch “Solution Platforms” to “x64”
10. Switch “Local Machine” to “Remote Machine”
11. Open the “Wolf (Universal Windows)” Project Properties
12. Select the “Debugging” tab in the “Wolf Property Pages” dialog
13. Set the “Machine Name” to your Xbox One name or IP address and press OK button
14. Open file “Package.appxmanifest”
15. Switch to “Capabilities” tab in “Package.appxmanifest”
16. Enable the following Capabilities (not sure which of these is actually responsible for making the Profiler work): Internet (Client & Server), Internet (Client), Point of Service, Private Networks (Client & Server), Remote System
17. Select Mainmenu > Build > Build Solution
18. Run on Xbox One

Unity
19. Open Window > Profiler
20. Connect to running Player on Xbox One
21. Note the “CPU” and “Gfx.WaitForPresent” cost
22. Close Profiler
23. Execute Mainmenu > BugReport > Enable GPU Instancing
24. Repeat the reproduction steps starting from step 2

Compare the “CPU” and “Gfx.WaitForPresent” cost of the different profiling sessions. Observe that GPU Instancing is significantly slower than using no draw-call batching at all.

See overview.png for my test results.

Expected
GPU Instancing should not be slower than using no draw-call batching at all. Judging from the numbers the Profiler shows, it seems to batch even more efficiently than static batching, so I’d even expect the test to perform better with GPU Instancing than with static batching.

Hey,

Does this issue happen on both Windows UWP and XboxOne?

I don’t know, I tested XboxOne only. If this information is important to you, I can test that.

Unity QA was able to reproduce the issue:

For some reason I’m unable to locate the bug-report in the Unity Issue Tracker though, would have linked it here otherwise.

Unfortunately the issue tracker is still broken.

Edit:

Apparently it has been fixed today and should work again.

@Peter77 Link to the bug report? I can’t find it. I was about to implement GPU Instancing in our UWP HoloLens application, but may hold off if this is a platform issue.

@LeonhardP , bug-report Case 946966 continues to not exist in the public Issue Tracker. Could you take a look?

I just marked it to be available in issue tracker. It should show up shortly.


It’s still not visible; it seems the problems with the public issue tracker persist.

We’re on it :).

Edit:

And fixed.

https://issuetracker.unity3d.com/issues/uwp-xbox-one-gfx-dot-waitforpresent-takes-significantly-longer-with-gpu-instancing-than-no-batching-at-all


Great, thanks! :)

https://issuetracker.unity3d.com/issues/uwp-xbox-one-gfx-dot-waitforpresent-takes-significantly-longer-with-gpu-instancing-than-no-batching-at-all


Hey. I took a look at this yesterday and the problem seems to be in the D3D11 driver for UWP. We Map/Unmap constant buffers for each of the instance batches, and while this works fine on PC, it seems to take forever on Xbox UWP only. I’m following up with Microsoft and will let you know when I hear back. I recommend you avoid instancing until they’re able to resolve this.
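To illustrate why a slow constant-buffer update can outweigh the draw calls that instancing saves, here is a toy cost model. It is only a sketch: the function `frame_cost_ms` and every number in it are hypothetical and invented for illustration; none of them are measurements from Unity, D3D11, or the Xbox.

```python
def frame_cost_ms(draw_calls, per_draw_ms, buffer_updates=0, per_update_ms=0.0):
    """Toy model: CPU cost = draw-call cost + constant-buffer update cost."""
    return draw_calls * per_draw_ms + buffer_updates * per_update_ms

# All values below are hypothetical, chosen only to show the effect.
OBJECTS = 2000        # objects drawn individually without batching
BATCHES = 40          # instance batches covering the same objects
PER_DRAW_MS = 0.005   # assumed cost of issuing one draw call

# No batching: one draw call per object, no per-batch buffer update.
unbatched = frame_cost_ms(OBJECTS, PER_DRAW_MS)

# Instancing where each per-batch buffer update is cheap.
instanced_fast = frame_cost_ms(BATCHES, PER_DRAW_MS, BATCHES, 0.01)

# Instancing where each per-batch buffer update is very slow.
instanced_slow = frame_cost_ms(BATCHES, PER_DRAW_MS, BATCHES, 0.5)

print(f"no batching:            {unbatched:.2f} ms")
print(f"instancing, fast update: {instanced_fast:.2f} ms")
print(f"instancing, slow update: {instanced_slow:.2f} ms")
```

In this model, instancing wins easily while the per-batch update is cheap, but loses to no batching once the update cost per batch grows large enough, which matches the shape of the behaviour described above.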

Good news! With the newest Xbox update this seems to be fixed. I’m now getting 63 fps with instancing off and 135 fps with instancing on (I had to turn VSync off to go above 60 fps). No changes in Unity were needed ;).
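For comparison with the millisecond values the Profiler reports, the two fps figures can be converted to frame times. This is a minimal sketch; it assumes the 135 fps figure is the instancing-enabled result, and the helper name `frame_time_ms` is made up for this example.

```python
def frame_time_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps

# fps figures quoted in the post above (135 assumed to be with instancing on)
off_ms = frame_time_ms(63)
on_ms = frame_time_ms(135)

print(f"instancing off: {off_ms:.2f} ms/frame")
print(f"instancing on:  {on_ms:.2f} ms/frame")
print(f"saved per frame: {off_ms - on_ms:.2f} ms")
```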


That’s excellent news, I have to give it a try later. Thanks for keeping us in the loop!

The test scene I submitted with Case 946966 now runs at 60 fps (I didn’t turn off VSync), according to its fps overlay. Unfortunately, I can’t attach the Unity Profiler to it to get meaningful data; it always times out.

I’m going to test my actual project now.

EDIT: The reason I couldn’t connect the Profiler was that I had forgotten to enable some networking Capabilities.

Do you mean an Xbox One firmware update or a Unity update?

Thanks!

Xbox update. He wrote that no changes to Unity were needed.

My actual project also runs a lot better with the latest Xbox One update, so I assume GPU Instancing is indeed fixed. Awesome! :)

Hey Peter, what update of Xbox One are you using? We have tested our project with the update of February 6 but it is not running better.

Here is what System > Console info shows on Xbox:

Has anyone tested this running in “Game” mode?
I don’t see the issue in “App” mode but when I switch to “Game” mode it runs very poorly.

[Screenshot: Profiler]

Using OS version: 10.0.16299.5101 (rs3_release_xbox_dev_1802.180131-1450) as mentioned.