Rendering the provided test scene on Xbox One (UWP, retail console in developer mode) with “GPU Instancing” enabled is significantly slower than rendering with no draw-call batching at all, even though Unity does batch a lot of draw calls when GPU Instancing is used.
2. Open Player Settings and turn off Static and Dynamic Batching
3. Open File > Build Settings
4. Select the “Universal Windows Platform” tab
5. Turn on “Development Build” (to be able to attach the Unity Profiler)
6. Build the UWP Player with the settings Target=Any Device, Build Type=D3D
7. Open the generated .sln file in Visual Studio 2017
Visual Studio
8. Switch “Solution Configuration” to “Release”
9. Switch “Solution Platforms” to “x64”
10. Switch “Local Machine” to “Remote Machine”
11. Open the “Wolf (Universal Windows)” Project Properties
12. Select the “Debugging” tab in the “Wolf Property Pages” dialog
13. Set the “Machine Name” to your Xbox One name or IP address and press OK button
14. Open file “Package.appxmanifest”
15. Switch to “Capabilities” tab in “Package.appxmanifest”
16. Enable the following Capabilities (not sure which of these is actually responsible for making the Profiler work): Internet (Client & Server), Internet (Client), Point of Service, Private Networks (Client & Server), Remote System
17. In the main menu, choose Build > Build Solution
18. Run on Xbox One
Unity
19. Open Window > Profiler
20. Connect to running Player on Xbox One
21. Note the “CPU” and “Gfx.WaitForPresent” cost
22. Close Profiler
23. In the main menu, choose BugReport > Enable GPU Instancing
24. Repeat the reproduction steps, starting from step 2
Compare the “CPU” and “Gfx.WaitForPresent” cost of the different profiling sessions. Observe that GPU Instancing is significantly slower than using no draw-call batching at all.
See overview.png for my test results.
Expected
GPU Instancing should not be slower than using no draw-call batching at all. Considering that, according to the numbers the Profiler shows, it seems to batch even more efficiently than static batching, I’d even expect the test to perform better with GPU Instancing than with static batching.
@Peter77 Link to the bug report? I can’t find it. I was about to implement GPU instancing in our UWP HoloLens application, but I may hold off if this is a platform issue.
Hey. I took a look at this yesterday and the problem seems to be in the D3D11 driver for UWP. We Map/Unmap constant buffers for each of the instance batches, and while this works fine on PC, it seems to take forever on Xbox UWP only. I’m following up with Microsoft and will let you know when I hear back. I recommend you avoid instancing until they’re able to resolve this.
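To illustrate why fewer draw calls can still cost more CPU time in this situation, here is a toy cost model in Python. It is only a sketch: the function name `frame_cost_us` and all the numbers are hypothetical placeholders, not measurements from this case or from any driver. It assumes every draw call has a fixed cost and every instance batch pays for one extra constant-buffer Map/Unmap.

```python
def frame_cost_us(draw_calls: int, draw_cost_us: int,
                  batches: int = 0, map_cost_us: int = 0) -> int:
    """Toy model: total CPU cost of one frame in microseconds.

    Every draw call costs draw_cost_us; every instance batch additionally
    pays map_cost_us for one constant-buffer Map/Unmap.
    """
    return draw_calls * draw_cost_us + batches * map_cost_us


# Hypothetical numbers, chosen only to show the effect.
no_batching = frame_cost_us(draw_calls=2000, draw_cost_us=5)
instanced = frame_cost_us(draw_calls=100, draw_cost_us=5,
                          batches=100, map_cost_us=200)

print(no_batching, instanced)
```

With these made-up inputs the instanced frame comes out more expensive despite issuing 20x fewer draw calls, because the per-batch Map/Unmap cost dominates the total.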
Good news! With the newest Xbox update this seems to be fixed. I’m now getting 63 fps with instancing off and 135 fps with instancing on (I had to turn VSync off to go above 60 fps). No changes in Unity were needed ;).
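For anyone who prefers frame times over fps, the two figures can be converted with a few lines of Python. This is just a quick sketch: `frame_time_ms` is a helper name made up for this example, and it reads the 63 fps figure as instancing off and the 135 fps figure as instancing on.

```python
def frame_time_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


off_ms = frame_time_ms(63)   # instancing off
on_ms = frame_time_ms(135)   # instancing on (as read from the post above)

print(f"instancing off: {off_ms:.2f} ms/frame")  # 15.87 ms/frame
print(f"instancing on:  {on_ms:.2f} ms/frame")   # 7.41 ms/frame
print(f"speedup: {off_ms / on_ms:.2f}x")         # 2.14x
```

So after the console update, instancing cuts the frame time roughly in half in this test scene.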
The test scene I submitted with Case 946966 runs at 60 fps (I didn’t turn off VSync), according to its fps overlay. Unfortunately, I can’t attach the Unity Profiler to it to get meaningful data; it always times out.
I’m going to test my actual project now.
EDIT: The reason I couldn’t connect the Profiler was that I forgot to enable some networking Capabilities.