Prevent shader compilation stutters with PSO Tracing in Unity 6

Greetings from the Unity Graphics team.

Unity 6 introduces a powerful new Pipeline State Object (PSO) tracing and precooking workflow that helps you deliver a smoother, stutter-free player experience.

This set of APIs provides a significant upgrade over the “shader warmup” API introduced in older Unity versions. While traditional shader warmup is sufficient for older graphics APIs (e.g. OpenGL, DirectX11), the new PSO workflow makes better use of modern graphics APIs, such as Vulkan, DirectX12 and Metal.


PSO Creation and Caching

When targeting modern graphics APIs, the GPU vendor’s graphics drivers perform runtime shader compilation (and other rendering state translation) as part of the Pipeline State Object creation process. As a result, PSO creation is a lengthy process, which may lead to noticeable stutters in the running application. This overhead can be exacerbated in more complex projects, which require the application to compile a large number of PSOs on the fly.

You can identify PSO creation stutters in the Unity Profiler by looking for the GraphicsPipelineImpl markers.

In many cases, the GPU vendor’s graphics drivers will automatically cache any compiled PSO to disk, in order to accelerate PSO creation for subsequent application runs. However, the application may still need to compile PSOs for newly encountered shader variants and materials. Furthermore, OS and driver updates will often invalidate the driver-managed PSO cache.

To eliminate PSO stutters for initial (and subsequent) application runs, we recommend tracing the PSOs required by a given level or scene, and ensuring they are precompiled ahead of their first use when drawing the scene. Effective use of PSO precooking improves the player’s experience and leads to a better first impression.

The ideal way to warm PSOs may vary depending on your application and use case. For example, you may choose to synchronously precook PSOs during level transitions and scene loading. This can be done progressively (time-sliced), creating a fixed number of PSOs per frame, in order to keep the application responsive.

Alternatively, you can choose to asynchronously precook PSOs in the application’s background. This will not block the application, but may temporarily regress CPU performance for the duration of the warm up.



Tracing a new PSO Collection

PSO tracing should be performed using a development build. You can write C# scripts to collect the PSOs created by the application during rendering:

  1. In a C# script, create a new GraphicsStateCollection. This collection corresponds to your application’s or scene’s PSOs.

  2. To begin tracing PSOs into your collection, call the GraphicsStateCollection.BeginTrace method. Any new graphics pipelines created by the application will be added to the collection. In most cases, you should begin tracing during scene or application start up.

  3. To finalize the tracing process, call the GraphicsStateCollection.EndTrace method. In most cases, you should end tracing during scene or application end.
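The three steps above can be sketched as a small component. This is a minimal sketch rather than the sample project's implementation: the class name `PsoTraceExample` is hypothetical, and the `UnityEngine.Experimental.Rendering` namespace is an assumption based on the Unity 6.0 scripting reference, so check it against your Unity version.

```csharp
using UnityEngine;
using UnityEngine.Experimental.Rendering; // assumed namespace of GraphicsStateCollection in Unity 6.0

// Hypothetical component: attach it to an object that lives as long as the scene.
public class PsoTraceExample : MonoBehaviour
{
    GraphicsStateCollection m_GraphicsStateCollection;

    void Start()
    {
        // Step 1: create an empty collection for this scene's PSOs.
        m_GraphicsStateCollection = new GraphicsStateCollection();

        // Step 2: record every graphics pipeline created from now on.
        m_GraphicsStateCollection.BeginTrace();
    }

    void OnDestroy()
    {
        // Step 3: stop recording when the scene (or application) ends.
        m_GraphicsStateCollection.EndTrace();
    }
}
```

Tying the trace to `Start` and `OnDestroy` follows the "scene start" and "scene end" advice in the steps; a project with its own level manager would more likely call these methods from there.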


Saving a PSO Collection

Once tracing is complete, you can save the PSO collection to disk using GraphicsStateCollection.SaveToFile. This stores the file at the provided path on the player’s device:

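// Build the output path inside the application's persistent data folder.
string path = System.IO.Path.Combine(Application.persistentDataPath, "Scene.graphicsstate");
// Write the traced collection to disk on the device running the Player.
m_GraphicsStateCollection.SaveToFile(path);
Debug.Log("Saved GSC to path: " + path);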

When deploying the Player to another device (e.g. Android, iOS), you will need to manually copy the file back to the Editor’s machine, for example using adb pull on Android.

Alternatively, you can use GraphicsStateCollection.SendToEditor to retrieve the PSO collection over the Player connection. Before building the project, make sure you enable both the “Development Build” and “Autoconnect Profiler” options in the Build Profile settings.

Note: manually connecting the Editor’s Console to the local Player also establishes the connection.
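As an illustration of the SendToEditor route, here is a hedged sketch of a helper that ends the trace and sends the result. The class name `PsoCollectionExport`, the method name and the file name passed in are hypothetical, and the namespace is assumed from the Unity 6.0 scripting reference. How the Editor resolves the file name is not covered in this post.

```csharp
using UnityEngine;
using UnityEngine.Experimental.Rendering; // assumed namespace in Unity 6.0

// Hypothetical helper: finish a trace and send the collection to the Editor.
public static class PsoCollectionExport
{
    public static void FinishAndSend(GraphicsStateCollection collection, string fileName)
    {
        // Stop recording before exporting the collection.
        collection.EndTrace();

        // SendToEditor relies on the Player connection, which needs a development build.
        if (!Debug.isDebugBuild)
        {
            Debug.LogWarning("Not a development build; skipping SendToEditor.");
            return;
        }

        collection.SendToEditor(fileName);
        Debug.Log("Sent GSC to Editor as: " + fileName);
    }
}
```

A call such as `PsoCollectionExport.FinishAndSend(m_GraphicsStateCollection, "Scene.graphicsstate")` would replace the SaveToFile snippet above when the Player is connected to the Editor.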


Precooking a PSO Collection

Once tracing is complete, you can ask Unity to precook the PSO collection, ideally well ahead of drawing time. In most cases, the best time to perform warmup is during application or scene loading sequences.

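You can perform PSO precooking using two warmup methods:

  • GraphicsStateCollection.WarmUp precooks all PSOs in the collection.
  • GraphicsStateCollection.WarmUpProgressively precooks a given number of PSOs per call.

Both methods return a job handle, which lets you decide whether the warmup runs synchronously (by waiting on the handle) or asynchronously.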

Once the PSOs are created, the drivers will often cache them to disk. The next time the PSOs are precooked, they can be loaded directly from the cache.
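The sketch below shows one way to use the returned job handle. It assumes the two warmup methods are `GraphicsStateCollection.WarmUp` and `GraphicsStateCollection.WarmUpProgressively`, as listed in the Unity 6.0 scripting reference, and that `isWarmedUp` reports when the whole collection is done. The component name and its serialized fields are hypothetical.

```csharp
using System.Collections;
using Unity.Jobs;
using UnityEngine;
using UnityEngine.Experimental.Rendering; // assumed namespace in Unity 6.0

// Hypothetical component: warms a previously traced collection when the scene loads.
public class PsoWarmupExample : MonoBehaviour
{
    [SerializeField] GraphicsStateCollection m_Collection; // traced for this platform and graphics API
    [SerializeField] bool m_BlockUntilDone = true;
    [SerializeField] int m_PsosPerFrame = 0; // 0 means: warm the whole collection in one request

    IEnumerator Start()
    {
        if (m_PsosPerFrame > 0)
        {
            // Time-sliced: create a fixed number of PSOs each frame until done.
            while (!m_Collection.isWarmedUp)
            {
                m_Collection.WarmUpProgressively(m_PsosPerFrame).Complete();
                yield return null;
            }
            yield break;
        }

        JobHandle handle = m_Collection.WarmUp();
        if (m_BlockUntilDone)
        {
            // Synchronous: block here until every PSO has been created.
            handle.Complete();
        }
        else
        {
            // Asynchronous: keep rendering frames while the job runs in the background.
            while (!handle.IsCompleted)
                yield return null;
            handle.Complete();
        }
    }
}
```

Blocking suits loading screens, where a longer pause is acceptable. The asynchronous path keeps the application responsive at the cost of some CPU time while the warmup runs.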


Inspecting and modifying a PSO Collection

You can query the platform used when tracing the PSO collection via GraphicsStateCollection.runtimePlatform.

For additional control, you can inspect the recorded PSOs and variant data, and modify the collection as needed. GraphicsStateCollection.GetVariants can be used to retrieve all shader variants recorded in a PSO collection. You can then read the graphics states used by each variant via GraphicsStateCollection.GetGraphicsStatesForVariant. Lastly, you can modify the graphics states associated with each variant using AddGraphicsStateForVariant / RemoveGraphicsStatesForVariant.

Note: The GPU representation of a PSO may vary across platforms. We highly recommend tracing in a development build that targets the relevant graphics API, and maintaining a separate collection per target.
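As a rough illustration of inspecting a collection, the sketch below logs a summary and the number of graphics states per variant. The exact signatures are assumptions taken from the Unity 6.0 scripting reference: `GetVariants` filling a list of `GraphicsStateCollection.ShaderVariant`, a `GetGraphicsStatesForVariant` overload that accepts a `ShaderVariant`, and the `variantCount` and `totalGraphicsStateCount` properties. Verify them against your Unity version. The class name `PsoCollectionInspector` is hypothetical.

```csharp
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.Experimental.Rendering; // assumed namespace in Unity 6.0

// Hypothetical helper: prints what a traced collection contains.
public static class PsoCollectionInspector
{
    public static void LogSummary(GraphicsStateCollection collection)
    {
        Debug.Log($"Traced on {collection.runtimePlatform}: " +
                  $"{collection.variantCount} variants, " +
                  $"{collection.totalGraphicsStateCount} graphics states");

        // Retrieve every shader variant recorded in the collection.
        var variants = new List<GraphicsStateCollection.ShaderVariant>();
        collection.GetVariants(variants);

        var states = new List<GraphicsStateCollection.GraphicsState>();
        foreach (var variant in variants)
        {
            // Read the graphics states recorded for this variant.
            states.Clear();
            collection.GetGraphicsStatesForVariant(variant, states);
            Debug.Log($"{variant.shader.name}: {states.Count} graphics states");
        }
    }
}
```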



Sample project (Update: 13.01.2025)

You can experiment with PSO tracing via the URP 3D Sample project, obtainable from the Unity Hub. As of version 17.0.6, the project includes the following scripts:

Assets/SharedAssets/Scripts/Runtime/GraphicsStateCollectionManager.cs

This is the main script, which performs PSO tracing and warmup at runtime, and holds a list of the recorded collections.

Assets/SharedAssets/Scripts/Editor/GraphicsStateCollectionStripper.cs

This script ensures the build only includes the relevant collections for the target platform. We recommend tracing a separate collection per graphics API, since graphics states vary per API.

The sample project targets multiple platforms and quality levels, so it includes multiple collections under the “Assets/SharedAssets/GraphicsStateCollections” folder. Every collection can take a few MBs of disk space, so this filtering script is needed to minimize build size.

Assets/SharedAssets/Scripts/Editor/GraphicsStateCollectionCombiner.cs

This utility script combines multiple selected collection files into one. The first collection you select is used as the result collection. Note that it only combines collections whose platform, graphics API and quality level all match.


To enable PSO warmup for the sample project, follow these steps:

  • Open the main scene located at “Assets/Scenes/Terminal/TerminalScene.unity”
  • Add an empty game object to the scene, and assign the Graphics State Collection Manager component
  • In the inspector, set the Mode property to “Warmup”
  • Right click on the component and select “Update collection list”
  • Save the scene and project. Build and run a standalone Player targeting Windows (DX12/Vulkan) or macOS (Metal).

PSO warmup should now be performed automatically, when loading the Terminal scene in the Player.

Note: If you previously ran the URP 3D Sample project, your graphics drivers may have already cached some shaders and PSOs locally. To test the new API properly, we recommend you first clear your shader cache. The steps for this vary by platform. For example, on Windows (DX12) with an NVIDIA GPU, you may find the shader cache in the following folders: \Users\<Username>\AppData\LocalLow\NVIDIA\PerDriverVersion\DXCache, \Users\<Username>\AppData\Local\D3DSCache.



Platform support

The new PSO workflow is available as of Unity 6, for Players targeting Metal, Vulkan and Direct3D12.

Starting with Unity 6.1, we also provide compatibility for older graphics APIs such as OpenGLES and Direct3D11, via a fallback to legacy shader-warmup.

Please give the new PSO precooking workflow a try and let us know what you think!

As always, you can follow our progress and discover new features via the public roadmap. If you cannot find the feature you are looking for, feel free to submit a feature request, or contact the team directly in the graphics forum.

How does it compare with Unreal’s PSO caching?

Questions
1. If I understand this correctly, to gather the PSOs into a collection, we need to perform all actions in the scene that can create a PSO; just ‘looking’ at new stuff only works in a simple case. E.g. we might have a full-screen pass on low HP or when players receive a debuff, or a VFX graph for a skill effect. We must spawn these so the tracing can collect this data into the collection, right?

2. Is the trace file specific to each platform, e.g. Windows/Mac/Android/iOS/PS5, or is it more granular than that? Do we need to collect a different file for Windows with an NVIDIA card and Windows with an AMD card?

The biggest difference is that GraphicsStateCollections are scriptable, and it’s possible to inspect and modify the contents of the PSOs via scripts. I believe Unreal also recently introduced a form of automated PSO gathering for DX12. We hope to introduce a similar concept in the future.

  1. The tracing function will collect graphics states for PSOs that were created at runtime, during the time of tracing. PSOs are created and cached when rendering new shader variants (such as when drawing new materials). So indeed, you will need to make sure the relevant materials and effects are rendered when tracing.

  2. The relevant graphics states may differ across graphics APIs, so we recommend tracing in the Player, targeting the relevant gfx API. You can query the platform used when tracing the PSO collection via GraphicsStateCollection.runtimePlatform. The GPU vendor and driver implementation do not matter in this case, since the gfx API calls submitted for PSO creation are the same (as long as you are targeting the same graphics API).
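As a small illustration of that answer, a script could check whether a collection was traced on the current platform and graphics API before warming it. This is a sketch under assumptions: `graphicsDeviceType` is assumed to exist on the collection alongside `runtimePlatform` (per the Unity 6.0 scripting reference), and the helper name is hypothetical.

```csharp
using UnityEngine;
using UnityEngine.Experimental.Rendering; // assumed namespace in Unity 6.0

// Hypothetical helper: decides whether a traced collection fits the running Player.
public static class PsoCollectionFilter
{
    public static bool MatchesCurrentDevice(GraphicsStateCollection collection)
    {
        // Platform and graphics API must match; GPU vendor and driver are not compared.
        return collection.runtimePlatform == Application.platform
            && collection.graphicsDeviceType == SystemInfo.graphicsDeviceType;
    }
}
```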

Sorry if I don’t understand this, I’m not familiar with this stuff.

What happens when we render materials before the warm up, is the rendering just more laggy?
So would the workflow work like:

  • Check if we have a tracing saved to disk
  • If there isn’t one we call BeginTrace
  • We wait a little while so that the player can look at most of our materials
  • Once we have enough, we save our tracing
  • Then we begin the warm up which should be fast if the drivers already cached it

It’s what I understood, but maybe I’m completely wrong idk.

In any case it’s good stuff, Unity 6 is making some big improvements to graphic performance.

Hiccups are less of a thing in Unity, maybe because we’re so used to adding our shaders to the pre-compile list. But for DX12 and Vulkan that will be nice. Will be staying with GLES3 until then.

Unreal struggles without caching, BIRP is fine for some reason, but I think URP hiccups a lot; on Switch it was necessary to precompile them. Which puzzles me: why this new system when precompiling worked fine? Does it make more sense on AAA titles with tons of shaders?
I end up keeping tight manual control of lighting and using a handful of uber shaders, so maybe on large productions with tech artists creating 100 shaders this becomes necessary.

When you render newly encountered materials and variants, the engine will trigger Pipeline State Object (PSO) compilation on modern graphics APIs. This can be a lengthy process, and will introduce noticeable hitches/stuttering. Especially for the initial application run, but also whenever players encounter new materials/shaders and thus new PSOs.

During the development process, we recommend you use the new GraphicsStateCollections API to trace the PSOs created at runtime by your application. Do this in a built Player, targeting your relevant graphics API.

To ensure you collect as many PSOs as possible, we recommend you play/fly through your content, and execute as many paths as possible. Aim to trigger the rendering of all relevant materials/variants and effects.

When tracing, we also recommend you switch between the graphics settings that will be available to players at runtime. Different graphics settings may load different shader variants, which would trigger the creation of new PSOs.

Once you are done tracing, send the collection to the Editor using PlayerConnection and save to disk. Then, write a script which warms this collection at application start up, scene load, or other point in time (this may depend on your content). This will trigger the creation and caching of PSOs.

The “Preload Shaders” setting will use Shader Warmup, to automatically pre-warm the Shader Variant Collections you specify in the Graphics settings. However, Shader Warmup is not fully supported on modern graphics APIs (DX12, Vulkan, Metal). It lacks the graphics states needed to create the correct PSOs.

We recommend you use the new GraphicsStateCollection API when targeting modern graphics APIs instead. This allows you to trace and warm the correct PSOs. We also aim to provide a fallback path for older graphics APIs, so you can simply use the GraphicsStateCollection workflow across all relevant platforms.

Which version will the API land in? Or is it already in 6000.0.7f1?

Is there not an easier way to do this?
Like an automated way, just enable a check box in player settings like garbage collection is a checkbox there. I’m asking this for non programmers.

GLES3.0 (or at least 3.1 or something) is very much needed since Vulkan on Android is asking for crashes and 1 star reviews on Google Play.

When you say “coming versions” is that something within the Unity 6.x lifecycle, or are we talking Unity 7+?

Also, I don’t see anything SRP specific in all of this, but I’m just making sure: This works for BIRP too, right?

We want to introduce more automation for PSO gathering in the future.

Similarly to the existing ShaderVariantCollection Warmup workflow, we could also introduce UI to save a GraphicsStateCollection in the Editor, for PSOs created by the Editor’s Play mode. Along with UI for preloading PSO collections. This sounds useful, so we will log this as a feature request. The relevant graphics states vary across graphics APIs, so you will need to use the same gfx API in Editor and Player for this to work though.

Note that players will encounter additional PSOs that are not rendered in the Editor. Even with UI and automation, we would still recommend using the tracing API for maximum possible coverage of PSOs.

Indeed, we are aiming to provide a fallback path for GLES and DX11 (using shader warmup) in the Unity 6 lifecycle. This API is render pipeline agnostic, so will work for both BiRP and URP.

I am considering integrating PSO tracing into my game through a game mode where the game automatically plays itself and saves the collection at the end of each level.

However, a challenge arises due to the game offering settings that allow players to adjust graphics, such as toggling shadows.

To collect PSOs with different settings, the game would need to play each level multiple times - once with shadows on and another with shadows off.

Given that players can adjust various graphics settings that impact shader variations, I am exploring ways to incorporate these variations into the PSO collection process more efficiently.

I wonder if there is a method to instruct the system to capture both variations (shadow on/off) simultaneously rather than playing each level twice.

The rationale behind this is that the current method of playing the game to collect PSOs becomes inefficient as more graphic settings are introduced. For instance, adjusting two settings would require playing the level four times, three settings would mean playing eight times, and so on.

I wonder if there is a way to tell the PSO collection to “capture each PSO with these additional shader variations” in a more streamlined manner.

What’s the recommended way to handle game graphics settings?

The ideal way to trace and precook PSOs may vary based on your application. Here are some example approaches:

  • Trace a PSO collection for each quality level: Play through the content with the relevant quality level and URP/HDRP renderer asset active. Make sure all materials/effects are rendered, and trace the PSOs to a collection. Then precook the relevant collection when players switch a quality setting.
  • Trace a large PSO collection for an entire level or game: Play through the content, again making sure to render all materials/effects, but also aim to switch between all graphics settings available at runtime. You can do this via scripts/UI, switching between SRP assets, volume settings, etc. Trace the PSOs to a collection, and precook at the start of the level/game.

To automate the process, you can also create a PSO capture scene, where you set up a camera to render all project materials and effects. Then write a script which performs PSO tracing, and switches between runtime graphics settings.
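A capture-scene script along those lines might look like the sketch below. It is an assumption-laden illustration, not the sample project's code: it only cycles through the project's quality levels (other runtime settings, such as SRP assets or volume settings, would need their own switching logic), it waits a fixed number of frames per level, and the component name, frame count and output file name are all hypothetical.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.Experimental.Rendering; // assumed namespace in Unity 6.0

// Hypothetical component for a capture scene whose camera sees all materials and effects.
public class PsoCaptureSceneExample : MonoBehaviour
{
    [SerializeField] int m_FramesPerLevel = 60; // arbitrary; long enough to render everything

    IEnumerator Start()
    {
        var collection = new GraphicsStateCollection();
        collection.BeginTrace();

        // Render the scene under every quality level so each variant's PSOs get created.
        for (int level = 0; level < QualitySettings.names.Length; level++)
        {
            QualitySettings.SetQualityLevel(level, true);
            for (int frame = 0; frame < m_FramesPerLevel; frame++)
                yield return null;
        }

        collection.EndTrace();
        collection.SendToEditor("CaptureScene.graphicsstate"); // hypothetical file name
    }
}
```

This corresponds to the "one large collection" approach. For one collection per quality level, the loop body would create, trace and send a separate collection for each level instead.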

For more precise control, you can copy a recorded collection and modify the shader variant information from a script.

Why can’t this be done automatically? i.e. (unity gets all shaders/effects and caches them)

I suppose it’s because prewarming all the assets found in a project wouldn’t work well: the driver’s eviction policy is far more likely to kick out PSOs, including some you might have needed for the immediate scene, in which case they would have to be baked a second time.

As for why it can’t be done automatically for each scene: yes, it should be possible for the elements in the scene, but you’d have to do something about dynamically created PSOs, e.g. from a script that instantiates via an asset address only when the player does something. The same goes for large worlds that feature object streaming, in which case you need to somehow tell your streaming API to load everything (if your workstation’s RAM allows it) or load all chunks in sequence. That is hard to create a generic solution for, since everyone does this stuff their own way.

I think there will always be some amount of manual effort for each project - but nevertheless a Right-Click → Bake on Scene or Prefab or Directory to generate a PSO manifest from the static content, plus a Right-Click → Merge of two manifests would be a good step forward.

If you could provide the optimized version of the project shown in the example video using PSO, along with the scripts, we would have a much clearer understanding of how the system works and be able to figure out how to use it. The documentation seems a bit weak in this regard. We kindly request a demo project that won’t require too much effort on your part.

Unpinning all these announcements takes forever :smiley:

So on the old graphics API, should I still continue to use shader warm-up? And warm-up with this PSO on the new graphics API? Am I understanding correctly?