Introduction of Render Graph in the Universal Render Pipeline (URP)

I noticed that I use a custom temporal AA in the renderer, and when I disable it the performance more closely matches Compatibility Mode, so perhaps the issue is the Temporal AA pass itself rather than the main system.

Will investigate this more and get back.

@nasos_333 What are the “SAVE TEMP2” “SAVE TEMP” passes doing after your custom TAA pass?

1 Like

Hey @Qleenie

I compared the traces quickly, and the RG-enabled trace rendered 2 more surfaces.
The 1st surface is normal. This is the SpaceWarp depth + motion vector pass:
Surface 3 | 480 x396 | color 64bit, depth 24bit, stencil 8 bit, MSAA 1, Mode: 3 (HwDirect) | 1 480x396 bins ( 1 rendered) | 0.15 ms | 2 stages : Binning : 0.018ms Render : 0.127ms

The 2nd surface looks suspicious:
Surface 9 | 1920x1584 | color 32bit, depth 24bit, stencil 8 bit, MSAA 1, Mode: 1 (HwBinning) | 45 384x176 bins ( 25 rendered) | 6.04 ms | 227 stages : Binning : 0.008ms LoadColor : 0.734ms Render : 0.597ms StoreColor : 0.521ms Preempt : 2.234ms LoadDepthStencil : 0.925ms StoreDepthStencil : 0.531ms

It looks like the RG-enabled trace did additional work to handle 1 extra blit or fullscreen pass, and this is the main extra cost. We never saw it before. Are you aware of any customization pass that behaves differently with RG on vs RG off?

To investigate further, we will need a RenderDoc capture to analyze the frame and understand where this extra fullscreen pass is introduced.

Could you submit a bug and attach your project/apk files? This way we can receive the project and look into it more.

Thanks,
Thomas

Hi, will check on this and get back asap

I have an idea what it is: I am using a custom UberPost shader to allow post-processing with Passthrough.
I’ll run some tests and, if positive, I’ll share the UberPost shader so you can check why it takes more time with Render Graph. Unfortunately, I cannot submit the project.

I created a new project, copied all my settings into it, built a minimal scene with only one cube, and did not use the UberPost hack I mentioned above (I tested it, and it has no measurable impact).

The performance difference between Render Graph on and off is still drastic when post-processing is switched on. Measured as GPU utilization, so lower is better:

Test with post-processing enabled on the camera (no post-processing effect active!):

  • 58 without render graph vs. 78 with render graph, looking into the black void of the scene
  • 78 without render graph vs. 95 with render graph, looking at the grey cube filling the full view

Now in a test with post-processing disabled on the camera, the version with render graph actually performs better!

  • 43 without render graph vs. 44 with render graph, looking into the void
  • 68 without render graph vs. 54 with render graph, looking at the cube

Also, I saw 24 surfaces in ovrgpuprofiler when post-processing is switched on, compared to 6 when it is turned off, independent of the render graph setting. Again, no post-processing effects were turned on.

I submitted the project: IN-89508

2 Likes

Hi,

Is it possible that URP will get access to the RenderLayerMask dropdown in shader graph, similar to the Sample Buffer Node in HDRP? Or alternatively is it possible to apply a full screen effect to a single layer with Render Graph/Render Layers?

I’m trying to implement a full screen effect on a single layer and I think this could solve that problem for me.

Thanks!

Thanks for this thread! When using a RasterRenderPass, setting global properties is not allowed. That seems fine, since I could just update a material property instead, but I noticed while writing renderer features that if a material is used more than once, local material properties are not reliable, even when set just before each Blitter call (they seem to collapse to a single value for all blits that frame). What is the appropriate way to set shader values prior to blitting?

What is the difference between AddRasterRenderPass and AddRenderPass? When should either of these be used?

AddBlitPass doesn’t follow the same paradigm of using PassData and a GraphBuilder to execute commands. Is this just a shortcut method that does the work for you?

Is it okay to put multiple passes inside RecordRenderGraph?

You can enable it. This way, RenderGraph knows it can’t just cull the pass.

builder.AllowGlobalStateModification(true);
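
As a minimal sketch of what that looks like in context (the pass data type, destination handle, and `_MyGlobalIntensity` property are placeholder names, not from this thread):

```csharp
using (var builder = renderGraph.AddRasterRenderPass<MyPassData>(
           "Set Globals", out var passData))
{
    builder.SetRenderAttachment(destination, 0, AccessFlags.Write);

    // Declare that this pass modifies global state, so Render Graph
    // won't cull it even if nothing appears to depend on its output.
    builder.AllowGlobalStateModification(true);

    builder.SetRenderFunc((MyPassData data, RasterGraphContext context) =>
    {
        // Globals may only be set from a pass that declared the flag above.
        context.cmd.SetGlobalFloat("_MyGlobalIntensity", 0.5f);
    });
}
```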

You should use a Raster, Compute, or Unsafe pass in URP.
The initial version of RenderGraph was used by HDRP internally. We improved the API to add more guardrails before making it a public URP extension API. The old API is still there, but we are migrating HDRP to the new API; as a next step we will also expose the new extension API to HDRP, finally deprecating AddRenderPass.

Indeed, we call them helper passes; they are built on top of the lower-level Raster, Compute, or Unsafe passes. AddBlitPass saves you the trouble of writing the boilerplate to do just a blit. And we’ll add more optimizations later on that you’ll get automatically.
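
For reference, a hedged sketch of using the helper, assuming the `RenderGraphUtils.BlitMaterialParameters` overload; `source`, `destination`, and `m_Material` are placeholder names:

```csharp
// One call replaces the builder/PassData/SetRenderFunc boilerplate
// of a hand-written raster blit pass.
var blitParams = new RenderGraphUtils.BlitMaterialParameters(
    source, destination, m_Material, shaderPass: 0);
renderGraph.AddBlitPass(blitParams, passName: "My Blit");
```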

1 Like

Indeed, see the “tips and tricks” section on local versus global properties.
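
One pattern that sidesteps the shared-material collapse (a sketch under the assumption that the value only needs to be correct at draw time; `BlitPassData`, `_Intensity`, and the handle names are placeholders): store the per-blit value in the pass data and apply it inside `SetRenderFunc`, which runs at graph execution time just before that pass’s draw, so each blit records its own value instead of all blits seeing the last value set during recording.

```csharp
class BlitPassData
{
    public TextureHandle src;
    public Material material;
    public float intensity;   // per-blit value captured at record time
}

using (var builder = renderGraph.AddRasterRenderPass<BlitPassData>(
           "My Blit", out var passData))
{
    passData.src = source;
    passData.material = m_SharedMaterial;
    passData.intensity = intensityForThisBlit;

    builder.UseTexture(passData.src, AccessFlags.Read);
    builder.SetRenderAttachment(destination, 0, AccessFlags.Write);

    builder.SetRenderFunc((BlitPassData data, RasterGraphContext context) =>
    {
        // Applied at execution time, right before this pass's draw,
        // so other blits using the same material are unaffected.
        data.material.SetFloat("_Intensity", data.intensity);
        Blitter.BlitTexture(context.cmd, data.src,
            new Vector4(1, 1, 0, 0), data.material, 0);
    });
}
```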

1 Like

In case you missed it, here’s an overview of all the URP RenderGraph learning resources.

1 Like

@AljoshaD Apologies for the direct ping, I just wanted to make sure my original question didn’t get lost. I haven’t found many docs on Render Graph/Render Layers with Shader Graph in URP, and was wondering whether URP’s Sample Buffer Node will get access to the RenderLayerMask dropdown?

I’m trying to utilize Render Graph/Render Layers and implement a full screen effect on a single layer in URP, but it seems some of these options are not available yet in URP?

Thanks in advance!

Thanks a lot, I’ll go digest!

Hey, any insights from the project I submitted? For us it’s probably better to stick with 2022 LTS for now, but it would be good to know whether these performance issues are solvable in Unity 6 / Render Graph.

“EndRenderPass: Not inside a Renderpass BlitFinalToBackBuffer/Draw UIToolkit/uGUI Overlay”

This still seems to occur in the latest (6000.0.28f1) build. I have a similar setup of multiple cameras and a Windows build. Any ETA on the fix?

1 Like

Hey @girishd, is it with DX11/DX12 + MSAA + camera stacking? If so, this is a known issue, we should have a fix in the coming days/week. I will let you know when it lands.

Also, you should probably have more meaningful/specific errors logged before the one that you posted. This one is quite generic (basically, something went wrong during your render pass execution). For your information, we are planning to improve exception handling and error logging in RenderGraph soon; we were focusing on the “working” scenarios, but we are now looking at improving the “non-working” scenarios to properly help users navigate them.

1 Like

Hi,

I do the ping-pong below for Temporal AA:

// Ping-pong the TAA history buffers
// RenderTexture temp2 = temp;
// temp = temp1;
// temp1 = temp2;

passName = "SAVE TEMP2";
using (var builder = renderGraph.AddRasterRenderPass<PassData>(passName, out var passData))
{
    // Blit _handleTAA into the temporary buffer tmpBuffer1A
    passData.src = resourceData.activeColorTexture;
    desc.msaaSamples = 1; desc.depthBufferBits = 0;
    builder.UseTexture(_handleTAA, AccessFlags.Read);
    builder.SetRenderAttachment(tmpBuffer1A, 0, AccessFlags.Write);
    builder.AllowPassCulling(false);
    passData.BlitMaterial = m_BlitMaterial;
    builder.AllowGlobalStateModification(true);
    builder.SetRenderFunc((PassData data, RasterGraphContext context) =>
        ExecuteBlitPass(data, context, 2, _handleTAA));
}

passName = "SAVE TEMP";
using (var builder = renderGraph.AddRasterRenderPass<PassData>(passName, out var passData))
{
    // Blit _handleTAA2 into _handleTAA
    passData.src = resourceData.activeColorTexture;
    desc.msaaSamples = 1; desc.depthBufferBits = 0;
    builder.UseTexture(_handleTAA2, AccessFlags.Read);
    builder.SetRenderAttachment(_handleTAA, 0, AccessFlags.Write);
    builder.AllowPassCulling(false);
    passData.BlitMaterial = m_BlitMaterial;
    builder.AllowGlobalStateModification(true);
    builder.SetRenderFunc((PassData data, RasterGraphContext context) =>
        ExecuteBlitPass(data, context, 2, _handleTAA2));
}

passName = "SAVE TEMP";
using (var builder = renderGraph.AddRasterRenderPass<PassData>(passName, out var passData))
{
    // Blit tmpBuffer1A (the old _handleTAA) into _handleTAA2
    passData.src = resourceData.activeColorTexture;
    desc.msaaSamples = 1; desc.depthBufferBits = 0;
    builder.UseTexture(tmpBuffer1A, AccessFlags.Read);
    builder.SetRenderAttachment(_handleTAA2, 0, AccessFlags.Write);
    builder.AllowPassCulling(false);
    builder.AllowGlobalStateModification(true);
    passData.BlitMaterial = m_BlitMaterial;
    builder.SetRenderFunc((PassData data, RasterGraphContext context) =>
        ExecuteBlitPass(data, context, 2, tmpBuffer1A));
}
// END PING PONG

Could you also provide an example of procedural geometry generation being implemented with a render feature, perhaps using DrawProcedural and/or DrawProceduralIndirect?

For instance, procedurally generate an unlit (1,1,1) pink cube that sits at the world origin that rotates around its Y axis at 1 rev/sec. using its vertex shader.

I couldn’t find anything on this forum or in the manual detailed enough to accomplish this task with Render Graph. My own attempts resulted in the camera image turning flat grey.

EDIT: Ugh, never mind. It just started working for no reason at all.

Hey @nasos_333, thanks for sharing,

So:

  • first pass SAVE TEMP2 blits TAA to a temporary buffer tmpBuffer1A
  • second pass SAVE TEMP blits TAA2 to TAA
  • third pass SAVE TEMP blits tmpBuffer1A (old TAA) to TAA2

Is my understanding correct? I notice that you use the same m_BlitMaterial for those 3 blits. What does it do? Is it just a pure copy? If so, I have the impression that you are just swapping the TAA and TAA2 textures through 3 blits that are costly on the GPU, when you could simply swap the TextureHandles that you save in _handleTAA and _handleTAA2, i.e. their indices into the Render Graph-owned texture buffer. Nothing changes on the GPU side, so it’s basically free. You could compare it to swapping pointers in C++ versus copying the buffers they point to.

Instead of those three blits, can you try to add at the end of your last TAA pass:

TextureHandle tempPrevTAA = _handleTAA;
_handleTAA = _handleTAA2;
_handleTAA2 = tempPrevTAA;

We do something a bit similar with the URP color buffer in this sample.

1 Like