I noticed that I use a custom temporal AA in the renderer, and when I disable it the performance more closely matches compatibility mode, so perhaps the issue is the temporal AA itself rather than the main system.
It looks like the RG-enabled trace did additional work to handle one extra blit or fullscreen pass, and this is the main extra cost. We never saw this before; are you aware of any custom pass that behaves differently with RG on vs. RG off?
To investigate further, we will need a RenderDoc capture to analyze the frame and understand where this extra fullscreen pass is introduced.
Could you submit a bug and attach your project/APK files? That way we can look into it further.
I have an idea what it is: I am using a custom UberPost shader to allow post-processing with Passthrough.
I will run some tests and, if they confirm it, I'll share the UberPost so you can check why it's taking more time with Render Graph. Unfortunately, I cannot submit the project.
I created a new project, copied all my settings into it, built a minimal scene with only one cube, and did not use the UberPost hack I mentioned above (I tested it, and it has no measurable impact).
With post-processing switched on, the performance difference between render graph on and off is still drastic, measured as GPU utilization (lower is better):
Test with post-processing enabled on the camera (no post-processing effect active!):
58 without render graph vs. 78 with render graph, looking into the black void of the scene
78 without render graph vs. 95 with render graph, looking at the grey cube filling the full view
Now, in a test with post-processing disabled on the camera, the version with render graph actually performs better!
43 without render graph vs. 44 with render graph, looking into the void
68 without render graph vs. 54 with render graph, looking at the cube
Also, I saw 24 faces in ovrgpuprofiler with post-processing switched on, compared to 6 with it turned off, independent of the render graph setting. Again, no post-processing effects were turned on.
Is it possible that URP will get access to the RenderLayerMask dropdown in Shader Graph, similar to the Sample Buffer Node in HDRP? Or, alternatively, is it possible to apply a full-screen effect to a single layer with Render Graph/Render Layers?
I'm trying to implement a full-screen effect on a single layer, and I think this could solve that problem for me.
Thanks for this thread! When using a RasterRenderPass, setting global properties is not allowed. That seems fine, since I could just update a material property instead, but I noticed when making renderer features that if a material is used more than once, local material properties are not reliable even if set just before each Blitter call (they seem to collapse to a single value for all blits that frame). What is the appropriate way to set shader values prior to blitting?
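To illustrate (simplified; the property and field names here are just placeholders, not my actual code):

```csharp
// Both blits appear to render with the SECOND value, as if the property
// were resolved once for the whole frame rather than per blit.
m_SharedMaterial.SetFloat("_Intensity", 0.25f);
Blitter.BlitTexture(cmd, sourceA, new Vector4(1, 1, 0, 0), m_SharedMaterial, 0);

m_SharedMaterial.SetFloat("_Intensity", 1.0f);
Blitter.BlitTexture(cmd, sourceB, new Vector4(1, 1, 0, 0), m_SharedMaterial, 0);
```

My guess is that the material properties are read when the commands actually execute rather than when they are recorded, so the last SetFloat wins for every blit that frame, but I'd love confirmation.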
What is the difference between AddRasterRenderPass and AddRenderPass? When should either of these be used?
AddBlitPass doesn't follow the same paradigm of using PassData and a GraphBuilder to execute commands. Is this just a shortcut method that does the work for you?
Is it okay to put multiple passes inside RecordRenderGraph?
You can enable it. This way, RenderGraph knows it canât just cull the pass.
builder.AllowGlobalStateModification(true);
You should use Raster, Compute, or Unsafe pass in URP.
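As a minimal sketch of a raster pass in the new API (Unity 6 URP; the pass name, material field, and texture names are placeholders, so adapt them to your renderer feature):

```csharp
class PassData
{
    public Material material;
    public TextureHandle source;
}

public override void RecordRenderGraph(RenderGraph renderGraph, ContextContainer frameData)
{
    var resourceData = frameData.Get<UniversalResourceData>();
    var cameraData = frameData.Get<UniversalCameraData>();

    // Create a transient destination so we don't read and write the same attachment.
    var desc = cameraData.cameraTargetDescriptor;
    desc.depthBufferBits = 0;
    TextureHandle destination = UniversalRenderer.CreateRenderGraphTexture(
        renderGraph, desc, "_MyPassTarget", false);

    using (var builder = renderGraph.AddRasterRenderPass<PassData>("My Pass", out var passData))
    {
        passData.material = m_Material;                 // placeholder material field
        passData.source = resourceData.activeColorTexture;

        builder.UseTexture(passData.source);            // declare the read
        builder.SetRenderAttachment(destination, 0);    // declare the write

        // Opt out of culling if the pass modifies global state:
        builder.AllowGlobalStateModification(true);

        builder.SetRenderFunc((PassData data, RasterGraphContext ctx) =>
        {
            Blitter.BlitTexture(ctx.cmd, data.source, new Vector4(1, 1, 0, 0), data.material, 0);
        });
    }
}
```

Because reads and writes are declared through the builder, RenderGraph can order, merge, and cull passes; the AllowGlobalStateModification call is the escape hatch for passes whose effects the graph cannot see.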
The initial version of RenderGraph was used internally by HDRP. We improved the API to add more guardrails before making it a public URP extension API. The old API is still there, but we are migrating HDRP to the new API and will, as a next step, also expose the new extension API to HDRP, finally deprecating AddRenderPass.
Indeed, we call them helper passes; they are built on top of the lower-level Raster, Compute, or Unsafe passes. AddBlitPass saves you the trouble of writing the boilerplate to do just a blit, and we'll add more optimizations later on that you'll get automatically.
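For completeness, a one-line sketch of the helper (the exact overload and parameter names may vary by URP version, so treat them as an assumption and check the RenderGraphUtils docs):

```csharp
// using UnityEngine.Rendering.RenderGraphModule.Util;
// Copies 'source' into 'destination' with no hand-written PassData/builder code.
renderGraph.AddBlitPass(source, destination, Vector2.one, Vector2.zero, passName: "Copy Color");
```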
@AljoshaD Apologies for the direct ping, I just wanted to make sure my original question didn't get lost. I haven't found many docs on Render Graph/Render Layers with Shader Graph in URP, and was wondering if URP's Sample Buffer Node will get access to the RenderLayerMask dropdown?
I'm trying to use Render Graph/Render Layers to implement a full-screen effect on a single layer in URP, but it seems some of these options are not available in URP yet?
Hey, any insights from the project I submitted? For us it's probably better to stick with 2022 LTS for now, but it would be good to know whether these performance issues are solvable in Unity 6 / render graph.
Hey @girishd, is it with DX11/DX12 + MSAA + camera stacking? If so, this is a known issue, we should have a fix in the coming days/week. I will let you know when it lands.
Also, you should probably have more meaningful/specific errors logged before the one you posted. That one is quite generic (basically, something went wrong in your render pass execution). For your information, we are planning to improve exception handling and error logging in RenderGraph soon; we were focusing on the "working" scenarios, but we are now looking at improving the "non-working" scenarios to properly help users navigate them.
Could you also provide an example of procedural geometry generation being implemented with a render feature, perhaps using DrawProcedural and/or DrawProceduralIndirect?
For instance: procedurally generate an unlit (1,1,1) pink cube that sits at the world origin and rotates around its Y axis at 1 rev/sec using its vertex shader.
I couldn't find anything on this forum or in the manual that was detailed enough to perform this task with Render Graph. My own attempts resulted in the camera image turning flat grey.
EDIT: Ugh, never mind. It just started working for no reason at all.
first pass SAVE TEMP2 blits TAA to a temporary buffer tmpBuffer1A
second pass SAVE TEMP blits TAA2 to TAA
third pass SAVE TEMP blits tmpBuffer1A (old TAA) to TAA2
Is my understanding correct? I notice that you use the same m_BlitMaterial for those three blits. What does it do? Is it just a pure copy? If so, I have the impression that you are swapping the TAA and TAA2 textures through three blits that are costly on the GPU, when you could simply swap the TextureHandles you save in _handleTAA and _handleTAA2, i.e., their indices in the Render Graph-owned texture buffer. Nothing changes on the GPU side, so it's basically free. You can compare it to swapping pointers in C++ versus copying the buffers they point to.
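In code, the pointer-swap analogy is just exchanging the two handle fields on the CPU (a sketch using the _handleTAA/_handleTAA2 names from above):

```csharp
// No GPU work: only the Render Graph handles (indices) change owners.
TextureHandle tmp = _handleTAA;
_handleTAA = _handleTAA2;
_handleTAA2 = tmp;
```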
Instead of those three blits, can you try to add at the end of your last TAA pass: