DirectX12 and Graphics Jobs Improvements in 2023.1 (Split Graphics Jobs and Editor Support)

Greetings from the Graphics Platforms team!

As some of you may know, we recently promoted the DirectX12 backend out of experimental state in 2022.2.0a17, after introducing a significant number of performance and stability improvements.

Some of the notable changes for Unity 2022 include a new memory manager, which allows resources to be over-committed beyond available GPU memory, as well as optimizations to the rendering setup for additional CPU performance.


Draw call Performance Test (x4000 draw calls, DX12 w/ Graphics Jobs vs DX11)

Moving forward to Unity 2023.1, our team has been hard at work on additional optimizations and improvements, with the aim of matching or exceeding DX11 performance in the majority of use cases.

DirectX12 introduces exciting new capabilities and features, such as “Graphics Jobs” support for improved CPU utilization. When utilizing Native Graphics Jobs, the main thread processes and queues intermediate drawing commands, which the rendering thread converts into graphics API calls by launching worker threads that record and submit GPU command buffers to the queue.

Unity 2023.1a21 introduces a new Graphics Jobs Threading Mode called “Split Graphics Jobs”, which aims to reduce unnecessary begin/end-of-frame synchronization between the main thread and the native graphics jobs threads, resulting in significant performance improvements. In our internal testing, we are observing meaningful CPU performance gains over DX11 when targeting DX12 using Split Graphics Jobs.


HDRP Sample Scene (DX12 w/ Split Graphics Jobs vs DX12 & DX11)

Performance gains can become more pronounced in draw-call-heavy scenarios, such as our internal draw call performance test:


Draw call Performance Test (x4000 draw calls, DX12 w/ Split Graphics Jobs vs DX11)

We are also planning to introduce Editor support for Graphics Jobs, in order to improve the project authoring experience via better scene-view and playmode rendering performance when targeting DirectX12. We are currently aiming to introduce DX12 Graphics Jobs support in the Editor in Unity 2023.2, with support for other platforms (Metal, Vulkan) later down the line.

While DX12 already exceeds DX11 in terms of CPU performance, DX11 may still come out on top in GPU-heavy scenarios, due to significant driver optimizations for reducing GPU synchronization. With additional optimizations coming down the line, as well as the adoption of new DX12 API features, we aim to surpass DX11 performance in an ever-increasing number of scenarios.

With the DirectX12 backend now out of experimental status, we will continue to closely monitor DX12 utilization and incoming user reports to address your valuable feedback, and eventually promote the DX12 backend to the default graphics API for Windows platforms.

Finally, DX12 is getting some love. Happy to see it!

Bindless textures would be awesome for the SRP Batcher… Just saying =)

Does this imply that DXC will become the default compiler soon? I haven’t heard anything about that topic in a while.

Also, it would be nice if you added d3d12 as an actual compile target for shaders. Not having it will make it harder to manage the feature gap between DX11 and DX12.

Some bugs occur because of it too. For example, using an array without a predefined size, such as “T b[ ]”, adds a #pragma that disables shader compilation for d3d11, where such things are unsupported, and this also disables the shader for d3d12.

How does it affect latency? In my project that’s the most important metric. Reducing options like QualitySettings.maxQueuedFrames (from the default 2 to 1) pretty much halves FPS but improves latency.

I wish Unity would also test latency, as that’s more difficult to measure than FPS (it requires a good camera).

Official support for the DXC shader compilation path and a d3d12 target is indeed on our roadmap, and we will share more information and updates soon :)

We are definitely looking at latency as well, and have generally measured similar input latency between DX11 and DX12 with maxQueuedFrames at its default value. We advise keeping maxQueuedFrames at the default for the best balance between performance and latency. Setting maxQueuedFrames lower will reduce input latency, as you observed, but will hurt performance.

The team will continue to monitor input latency as we introduce additional DX12 improvements!

Hi, it may potentially increase latency by around half a frame or so. However, when testing with a camera (thankfully, normal consumer phones can record at 240 fps), it didn’t measurably increase it (vs DX11). The effect of this is essentially the same as what the driver thread in DX11 can do.
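
To put "half a frame" next to the 240 fps camera mentioned above, here is a rough back-of-the-envelope sketch. The game frame rates and the helper names are hypothetical, chosen only for illustration; this is plain arithmetic, not a model of how Unity or the driver queues frames.

```python
def frame_time_ms(fps: float) -> float:
    """Duration of one frame in milliseconds at the given frame rate."""
    return 1000.0 / fps


CAMERA_FPS = 240.0  # capture rate of the phone camera mentioned above


def half_frame_in_camera_frames(game_fps: float) -> float:
    """How many camera frames a half-frame latency change spans."""
    half_frame_ms = 0.5 * frame_time_ms(game_fps)
    return half_frame_ms / frame_time_ms(CAMERA_FPS)


# Hypothetical game frame rates, for illustration only.
for game_fps in (60.0, 120.0, 240.0):
    half_ms = 0.5 * frame_time_ms(game_fps)
    cam_frames = half_frame_in_camera_frames(game_fps)
    print(f"{game_fps:.0f} fps: half a frame = {half_ms:.2f} ms "
          f"= {cam_frames:.2f} camera frames")
```

The higher the game's frame rate, the fewer camera frames a half-frame change covers, so at high frame rates a difference of that size approaches the resolution of a 240 fps recording.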

We still support the old native graphics jobs and there are no plans to deprecate them. So if you have an extremely latency-sensitive project, you can indeed just use them and lower maxQueuedFrames, and you can get into a situation where your latency is better than with DX11. However, our default settings are the ones that match DX11 in latency.

Nice job

As the performance improvements seem to be quite substantial, is there any chance that Split Graphics Jobs will be backported to 2022.3 LTS?

I am actually experiencing 50% to 100% slower performance with the 2D Renderer path in 2023.1 compared to 2021.3. I am wondering if this has to do with a different DX version being used in each Unity version…

In the future, it would be more helpful if you use milliseconds instead of frames per second in your posts, as fps can be a bit misleading at times. Other than that, I give you two thumbs up for dedicating more time to optimization.

We always just calculate the average frame time, and the FPS is just the inverse of that. So you can just take the reciprocal of our FPS measurement. We are not doing the silly thing where one would calculate the arithmetic mean of FPS, as that would indeed give wrong results. One either takes the arithmetic mean of frame times or the harmonic mean of FPS (they’re identical).
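
A small worked example of that point, using made-up frame times (three fast frames and one hitch). The numbers and variable names are hypothetical; only Python's standard `statistics` module is used.

```python
from statistics import harmonic_mean, mean

# Hypothetical frame times in milliseconds: three fast frames and one hitch.
frame_times_ms = [10.0, 10.0, 10.0, 50.0]

# Instantaneous FPS of each individual frame.
fps_per_frame = [1000.0 / t for t in frame_times_ms]

# Arithmetic mean of frame times, then inverted to FPS.
avg_frame_time_ms = mean(frame_times_ms)
fps_from_avg_time = 1000.0 / avg_frame_time_ms

# Harmonic mean of per-frame FPS: the same quantity computed another way.
fps_harmonic = harmonic_mean(fps_per_frame)

# Arithmetic mean of per-frame FPS: overstates performance when frames vary.
fps_arithmetic = mean(fps_per_frame)

print(f"average frame time:     {avg_frame_time_ms:.1f} ms")
print(f"FPS from avg frametime: {fps_from_avg_time:.1f}")
print(f"harmonic mean of FPS:   {fps_harmonic:.1f}")
print(f"arithmetic mean of FPS: {fps_arithmetic:.1f}")
```

With these numbers the 80 ms of work covers 4 frames, so the average frame time is 20 ms and the true rate is 50 FPS. The harmonic mean of FPS agrees with that, while the arithmetic mean of FPS reports 80 because the single slow frame is underweighted.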

Hi,

Unfortunately, there are no plans for a backport, as it carries some risk of bugs. This is why features like these are introduced during the tech stream, which gives us adequate time to test.

I tested DX12 Split Graphics Jobs with Unity 2023.2.0a15 and the HDRP Sample Scene, and got better performance using DX11 (~10-20% more performance).

Hardware:
CPU: i5 12600k
GPU: RTX 3060Ti
RAM: 32 GB DDR4
Mainboard: Z690
SSD used: external 2TB Kingston SSD

Did you test it in the Editor or in a build? If I remember correctly, Graphics Jobs are not used in the Editor yet.

I tested it in a build. Maybe the configured setting wasn’t applied due to a bug? The performance difference in the build was identical to the one in the Editor.

You’re very likely GPU bound. The HDRP template is a bit heavy. The DX11 driver is very good at removing unnecessary sync points and even reordering commands as necessary, whereas our backends are of the immediate sort. Think of it like D3D11on12, which is also slower than real DX11.

Improving the GPU performance is next on our task list, but as one might guess, it’s more involved than the CPU side. Right now, whatever the C# scripts say happens in exactly that order, and we execute the commands exactly as they come. It is possible to write the scripts so that our current DX12 and Vulkan backends are as fast as DX11, but that’s more involved. Thus we must do a relatively large overhaul to introduce a system that doesn’t do exactly what is written, but just gets the end result right by reordering things according to their dependencies, like a good driver does.
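
As a toy illustration of "reordering things according to their dependencies", the sketch below groups commands into batches that have no hazards between them. Everything here is hypothetical (the command names, the resource names, the hazard rules, the helper functions); it is not Unity's backend, nor what a DX11 driver actually does, just the general idea expressed with Python's standard `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical commands in script order: (name, resources read, resources written).
commands = [
    ("draw_shadow_map", set(), {"shadow_tex"}),
    ("draw_ui", set(), {"ui_tex"}),
    ("draw_scene", {"shadow_tex"}, {"color_tex"}),
    ("composite", {"color_tex", "ui_tex"}, {"backbuffer"}),
]


def build_dependencies(cmds):
    """Map each command to the earlier commands it must wait for."""
    deps = {name: set() for name, _, _ in cmds}
    for i, (name, reads, writes) in enumerate(cmds):
        for earlier, e_reads, e_writes in cmds[:i]:
            # Read-after-write, write-after-write and write-after-read hazards.
            if (reads & e_writes) or (writes & e_writes) or (writes & e_reads):
                deps[name].add(earlier)
    return deps


def schedule_in_batches(cmds):
    """Group commands into batches; a sync is only needed between batches."""
    sorter = TopologicalSorter(build_dependencies(cmds))
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = schedule_in_batches(commands)
for index, batch in enumerate(batches):
    print(index, batch)
```

Executing these four commands strictly in script order with a sync after each one would serialize all of them, whereas the dependency view shows that the shadow map and UI passes do not depend on each other and can share a batch.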

Do you have any guidelines for doing this?
I understand you’re saying we should reduce sync points, but it’s hard to know where exactly those occur, since Unity creates them for you.

Hey, this is not exactly the right thread but I don’t know where else to post it.

DirectX12 is adding work graphs which could add some really nice “free performance” for Unity! It’s probably already on your radar, but if it isn’t: cheers.

Hi. When is official support for Split Graphics Jobs planned for the Android and iOS player runtimes? Currently, because graphics rendering is stuck on the main thread, we lose a lot of performance to time spent on the main thread.

This is the roadmap card for supporting it on Metal (and thus iOS):
https://portal.productboard.com/unity/1-unity-platform-rendering-visual-effects/c/1955-split-graphics-jobs-metal?utm_medium=social&utm_source=portal_share

You can vote on it to give it a higher priority.

Make sure, by profiling on your actual device, that this is where you are losing performance!

Also, it will probably be a long time until this free performance arrives, so unfortunately until then you will have to find other ways to optimize. Consider using the “CombineMesh” method to combine your static meshes into one, lowering your LOD bias, or lowering your max texture size. But for all of these, PROFILE!!! Otherwise you’re just shooting in the dark.

Sorry if you are already doing that and I come off as aggressive. A lot of people don’t profile, even in situations where it is absolutely critical.