Performant and Energy-Efficient Rendering with Render Graph and On-Tile Post Processing for Untethered XR in Unity 6.3

Rendering needs differ significantly across devices. A desktop GPU may consume hundreds of watts with large cooling systems, while phones, tablets and standalone XR headsets usually operate within only a few watts and rely on very limited cooling solutions. Even modern high end smartphones remain thermally constrained. Once these devices heat up, they quickly reduce performance to stay within safe limits.

Most mobile GPUs, including those in untethered XR devices, use a tile-based rendering architecture. Work remains performant and efficient as long as it stays inside the fast but limited on-chip tile memory. Extra cost appears when the pipeline stores data to system memory or loads it back, because this increases bandwidth usage. Moving data between the GPU and external memory consumes energy, and that energy ends up as heat. Besides the temperature increase, rendering can also become bandwidth bound, especially on low-end mobile devices or in untethered XR, where high-resolution stereo rendering produces a very large number of pixels.

How Unity 6 and the Render Graph Viewer help you stay efficient

Unity 6 uses the Render Graph system to automatically build an optimal pipeline configuration for the current frame. It merges compatible passes, removes unnecessary resources, and stays on tile whenever possible.

To understand what your frame is doing, the Render Graph Viewer is essential.

The Render Graph viewer shows:

  • Which passes can be merged into a single native render pass (blue line)
  • Which textures stay memoryless and which need stores or loads (filled vs. empty square on the left). The empty square is only visible on supported hardware with DX12, Vulkan or Metal.
  • Where load and store operations happen and increase bandwidth (red triangle = store, green triangle = load)

Whenever you change settings in your render pipeline, you immediately see how the Render Graph adapts. You can now also connect the viewer directly to player builds running on devices. This lets you inspect Render Graph execution on real hardware, see how native render passes actually merge at runtime, and spot unnecessary load or store operations. Connecting to the real device (for example a Meta Quest headset) is also the best way to validate that your project uses memoryless resources correctly.

As a base for the following two examples, we use a minimal pipeline. The Render Graph Viewer below shows this pipeline rendering directly to the backbuffer through a single native render pass.

Example 1: Depth Texture and pipeline changes

A simple checkbox can change how much memory bandwidth your rendering consumes. If you enable the Depth Texture option in the URP asset, the following happens:

  • The merged native pass splits, because URP needs to Copy Depth after the opaque pass. On most platforms we cannot read from the backbuffer.
  • URP now renders into intermediate textures (_CameraTargetAttachment, _CameraDepthAttachment) that live in main memory, which requires additional store and load operations.
  • The Blit Final To Back Buffer pass appears. This step has a potentially high cost on untethered XR and low end mobile hardware. For this reason, this feature is typically not recommended on those devices.

If you do not need the Depth Texture inside your transparent shaders, you can switch the depth copy to After Transparents. This allows the opaque, sky and transparent passes to merge again.
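The effect of the depth copy position can be illustrated with a toy model. The sketch below is not Unity's merging algorithm: the `RenderPass` type, the pass and resource names, and the merge rule (consecutive passes merge only if they draw into the same attachments and do not sample a texture the current group writes) are simplifying assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderPass:
    name: str
    attachments: frozenset              # render targets the pass draws into
    sampled: frozenset = frozenset()    # textures read through regular sampling


def group_native_passes(passes):
    """Greedily group consecutive passes into native render passes (toy rule)."""
    groups = []
    for render_pass in passes:
        if groups:
            current = groups[-1]
            written = set().union(*(p.attachments for p in current))
            same_targets = render_pass.attachments == current[-1].attachments
            samples_own_output = bool(render_pass.sampled & written)
            if same_targets and not samples_own_output:
                current.append(render_pass)
                continue
        # Different targets, or a sampled read of the group's output: start a new group.
        groups.append([render_pass])
    return [[p.name for p in group] for group in groups]


# Hypothetical pass and resource names, chosen to mirror Example 1.
CAMERA = frozenset({"color", "depth"})
opaques = RenderPass("Opaques", CAMERA)
skybox = RenderPass("Skybox", CAMERA)
transparents = RenderPass("Transparents", CAMERA)
copy_depth = RenderPass("Copy Depth", frozenset({"depth_texture"}), frozenset({"depth"}))

after_opaques = group_native_passes([opaques, copy_depth, skybox, transparents])
after_transparents = group_native_passes([opaques, skybox, transparents, copy_depth])

print(len(after_opaques), after_opaques)
print(len(after_transparents), after_transparents)
```

In this model, copying depth after the opaques yields three groups, while copying after the transparents leaves the opaque, sky and transparent passes in one group followed by the copy. The Render Graph Viewer remains the authoritative source for what actually merges in your project.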

Example 2: Post Processing

If we enable post processing in our minimal pipeline scenario, the Render Graph changes significantly. Rendering switches back to the intermediate textures described in example 1, and post processing adds several additional passes. The post processing itself runs through an Uber shader inside the Blit Post Processing pass. Depending on the active effects, additional preparation passes may be required. In this example, Blit Bloom Mipmaps appears to create the resources for the bloom pyramid.

If we disable bloom, the graph becomes smaller again. The _CameraDepthAttachment is marked with an empty square, which means it is a Memoryless Texture Resource. It can stay inside the tile and does not require any load or store operation, so it does not consume bandwidth. If we enable effects that need to read depth from neighboring pixels, such as Depth of Field (DoF), this texture would also need to be stored in system memory.
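One way to reason about which textures can be memoryless is to check whether every pass that uses a texture sits inside the same native render pass. The sketch below encodes only that rule of thumb. It is an assumption-laden simplification (hypothetical pass names, no handling of imported targets such as the backbuffer) and not the check Unity performs.

```python
def can_be_memoryless(resource_users, native_passes):
    """True if every pass that uses the resource sits in one native render pass."""
    touched = {
        index
        for index, group in enumerate(native_passes)
        if any(name in resource_users for name in group)
    }
    return len(touched) == 1


# Hypothetical frame: scene passes merged into one native pass, then post processing.
native_passes = [["Opaques", "Skybox", "Transparents"], ["Blit Post Processing"]]
color_users = {"Opaques", "Skybox", "Transparents", "Blit Post Processing"}
depth_users = {"Opaques", "Skybox", "Transparents"}

# Same frame with an effect that samples depth in its own pass (assumed name).
native_passes_with_dof = [native_passes[0], ["Depth Of Field"], ["Blit Post Processing"]]
depth_users_with_dof = depth_users | {"Depth Of Field"}

print("color:", can_be_memoryless(color_users, native_passes))
print("depth:", can_be_memoryless(depth_users, native_passes))
print("depth with DoF:", can_be_memoryless(depth_users_with_dof, native_passes_with_dof))
```

Here the color target crosses a native pass boundary and must be stored and loaded, the depth target stays within one native pass, and adding a depth-reading effect in a separate pass removes that property.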

Note that a Post Processing Final Blit Pass can also appear if any of the following is true:

  • FXAA is enabled. FXAA must run after all user authored passes, so it needs the final post pass.
  • Upscaling is enabled with a filter that cannot run inside UberPost. Only linear filtering works in UberPost, so filters like FSR require the final post pass.
  • TAA is enabled and uses standalone RCAS sharpening. When upscaling is not active, RCAS must run in the final post pass.

You can inspect these conditions inside URP’s source code here.
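As a reading aid, the three conditions above can be written as a single predicate. This is a paraphrase of the list, not URP's source code: the function name, the parameters and the filter strings ("linear", "fsr") are hypothetical, and combinations the list does not mention (for example TAA with RCAS together with linear upscaling) are a guess.

```python
def needs_final_post_pass(fxaa=False, upscaling_filter=None, taa_with_rcas=False):
    """Return True if, per the list above, a final post pass is expected.

    upscaling_filter is None when upscaling is off, otherwise a filter name.
    """
    upscaling_active = upscaling_filter is not None
    if fxaa:
        # FXAA has to run after all user-authored passes.
        return True
    if upscaling_active and upscaling_filter != "linear":
        # Only linear filtering fits into the Uber pass.
        return True
    if taa_with_rcas and not upscaling_active:
        # Standalone RCAS sharpening runs in the final post pass.
        return True
    return False


print(needs_final_post_pass())
print(needs_final_post_pass(fxaa=True))
print(needs_final_post_pass(upscaling_filter="linear"))
print(needs_final_post_pass(upscaling_filter="fsr"))
print(needs_final_post_pass(taa_with_rcas=True))
```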

New in Unity 6.3: On-Tile Post Processing for Untethered XR

The principles above apply to all mobile devices. Untethered XR, however, has an additional challenge. Each eye often renders at a high resolution of around 2K by 2K, which makes these workloads very bandwidth bound. Any extra load or store has a significant cost, and since these devices target very high framerates, every additional load or store becomes even more expensive. This is why the goal is to keep the entire frame inside a single native render pass whenever possible.
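To get a feeling for the numbers, here is a back-of-the-envelope estimate of what one extra store plus one extra load of a full-resolution color target costs. Only the roughly 2K by 2K per-eye resolution comes from the text above. The 4 bytes per pixel, the 72 frames per second and the absence of MSAA or framebuffer compression are assumptions, so treat the result as an order of magnitude rather than a measurement.

```python
def roundtrip_bandwidth_gb_per_s(width, height, eyes, bytes_per_pixel, fps):
    """Bandwidth of storing a full-resolution target once and loading it once per frame."""
    bytes_per_frame = width * height * eyes * bytes_per_pixel * 2  # one store + one load
    return bytes_per_frame * fps / 1e9


# Assumed figures: 2048x2048 per eye, 2 eyes, 4 bytes per pixel, 72 frames per second.
cost = roundtrip_bandwidth_gb_per_s(2048, 2048, eyes=2, bytes_per_pixel=4, fps=72)
print(f"{cost:.2f} GB/s")  # prints "4.83 GB/s"
```

Under these assumptions a single intermediate color texture that is written and read back once per frame adds close to 5 GB/s of memory traffic, before any depth, MSAA or additional post processing passes are counted.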

Unity 6.3 introduces a new On-Tile Post Processing for Untethered XR renderer feature, developed in collaboration with Meta:

  • Runs supported post effects directly on tile: color grading, vignette, HDR rendering with tonemapping, dithering and film grain
  • Removes the final blit so the entire frame can run in a single native render pass
  • Avoids storing intermediate textures
  • Reduces thermal pressure by lowering memory bandwidth, which helps keep rendering stable during long XR sessions
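What the supported effects have in common is that each output pixel depends only on the same pixel's input value and position. The sketch below contrasts such a per-pixel chain with a neighborhood filter. It is an illustration only: the Reinhard operator, the vignette formula and the 3x3 box blur are stand-ins chosen for brevity, not the shaders URP uses.

```python
def tonemap_reinhard(value):
    """Per-pixel tonemapping, v / (1 + v). Reinhard is an assumed stand-in operator."""
    return value / (1.0 + value)


def vignette(value, x, y, width, height, strength=0.5):
    """Darken a pixel based only on its own position relative to the image center."""
    u = (x + 0.5) / width - 0.5
    v = (y + 0.5) / height - 0.5
    return value * (1.0 - strength * min(1.0, 2.0 * (u * u + v * v)))


def per_pixel_post(image):
    """Output[y][x] depends only on image[y][x] and (x, y): no neighbor reads."""
    height, width = len(image), len(image[0])
    return [[vignette(tonemap_reinhard(image[y][x]), x, y, width, height)
             for x in range(width)] for y in range(height)]


def box_blur_3x3(image):
    """Output[y][x] reads a 3x3 neighborhood (clamped at the borders)."""
    height, width = len(image), len(image[0])

    def sample(x, y):
        return image[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    return [[sum(sample(x + dx, y + dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
             for x in range(width)] for y in range(height)]


def changed_pixels(a, b):
    """Count the pixels that differ between two images of equal size."""
    return sum(1 for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b) if pa != pb)


base = [[1.0] * 4 for _ in range(4)]
edited = [row[:] for row in base]
edited[1][1] = 5.0  # change a single input pixel

per_pixel_changes = changed_pixels(per_pixel_post(base), per_pixel_post(edited))
blur_changes = changed_pixels(box_blur_3x3(base), box_blur_3x3(edited))
print(per_pixel_changes, blur_changes)
```

Changing one input pixel changes exactly one output pixel in the per-pixel chain, but nine in the blur. A filter of the second kind needs pixels that may belong to other tiles, which is why it cannot run purely on tile.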

The following comparison shows the previously not recommended post processing next to the new on-tile post processing on Meta Quest 3:


We can directly connect the Render Graph Viewer to the Meta Quest 3 device, which shows the following pipeline:

We see here an optimal configuration while still using the original post effects, which we previously recommended disabling on this platform. There is no final blit to the backbuffer, no intermediate color or depth stores, and everything runs inside a single native render pass.

Try it and share feedback

Please refer to the documentation to learn how to enable this feature in your untethered XR project. We would like to hear how this works in your projects, especially regarding performance and memory bandwidth. Feel free to also share images of your Render Graph Viewer connected to the device.

If you already integrated color grading or similar post adjustments directly into your shaders, we would like to understand your workflow and how this new feature might simplify your setup. Let us know how you solved this before, whether the on tile approach could fit your project and how you feel about HDR rendering in XR.

Graphics Talk from Unite Barcelona 2025

We recently presented a talk at Unite Barcelona titled “Glow up your graphics with Unity 6.3 LTS and beyond. Maximizing graphics performance and visual fidelity across platforms”. The video is now available and includes a section on energy-efficient rendering starting at 26:35 with live demos around this topic.

Sorry if I read over it, but is there a reason the on-tile PP is only for standalone XR?
To me it seems like all mobile devices with tile based GPUs would benefit

We started with untethered XR because that is where we have the biggest need and the most direct performance benefit. The scope and device fragmentation are also much more limited there, so it is safer to roll this out first on a more manageable scale. But you are right that most mobile devices would benefit.

We are actively working on it, and are very happy with our progress internally on that topic. We will share an update here, once it lands in a public alpha :slight_smile:

Is it compatible with foveated rendering?

Is there any way of having bloom-like post processing without breaking this?
Most of the post processing, the HDR color range and the tonemapping are awesome already, but I want to take the final colors into account and apply bloom at the very end or something. Isn't that possible?

Hi @oliverschnabel

I spent an hour trying to understand why I couldn’t get my _CameraTargetAttachment to be memoryless, and I finally realized that it was because I have MSAA enabled.

Is this the expected behaviour? Considering that for XR MSAA is pretty much a must, I would expect that frame buffer fetch should work with MSAA too.

To be clear, I’m not even talking about on-tile post processing. Even without any post processing, if I have MSAA enabled, Unity uses an intermediate texture and reads from it at the end.

Hi @Maverick3000
Are you looking at this in the editor or in a standalone desktop build? There the backbuffer always has a single sample (no MSAA), so the intermediate textures are added. If you run a development build on device (Quest, Android Vulkan, ...), you can connect the Render Graph Viewer and see that it runs without the intermediate textures, with an MSAA backbuffer.

No. Bloom needs to sample neighboring pixels so it can’t work. You’ll need to fake bloom with other techniques.

Right… I should have thought about that. I was testing it in the editor with the Vulkan API. I didn’t consider that this could have been the reason, because the camera depth texture, in contrast, was memoryless.

On a similar topic, is it possible to sample the back buffer with frame buffer fetch inside a compute shader?
The builder created inside AddComputePass or AddUnsafePass doesn’t implement SetInputAttachment.

For an unsafe pass I imagine it’s because unsafe passes can’t be merged. But is it also the case for compute passes?
What I’m trying to do is implement an on-tile auto exposure. I need to run a compute shader that computes the current luminance, calculates how much adjustment is required to reach the desired exposure, and applies it to the back buffer.

Indeed, by design only raster passes can be merged.

Isn’t there any way to add a post-process pass at the very end that takes advantage of the final result, outside of the on-tile post processing? That way I could mix the default bloom (at the end) with the rest of the on-tile post processing.

Yes, foveated rendering is supported with On-Tile Post Processing.

Does running a compute shader on the ActiveCameraColor always cause the texture to be transferred to device memory, or is there a way to stay in tile memory?

Please make the depth buffer memoryless on Quest devices. You can do it now, but you have to modify URP and turn off Render Graph validation, which is a little inconvenient.

Agree, would be great to be able to access on-tile depth for custom vfx passes.

It’s not possible.

What version are you using? There are nice improvements in this area in 6.3. Not sure it will exactly line up with your needs though.

I use 6.3 and modified the URP RenderObjectsPass with these two calls:

builder.SetRenderAttachmentDepth(resourceData.activeDepthTexture, AccessFlags.Read);
builder.SetInputAttachment(resourceData.activeDepthTexture, index: 0, AccessFlags.Read);

plus other modifications to make Shader Graph work with depth. I also turn off “Enable Validation Checks” for the Render Graph. It works, but it would be nicer if it worked out of the box, so that keeping URP up to date is easier.

Ah ok, you want to use depth as an input attachment. It doesn’t need to be memoryless for that.
That indeed doesn’t work, although we are working on that now for 6.5/6.6.
Likely it works on Vulkan indeed. The validation layer is now conservative and only allows things that work on all graphics APIs.
