Hello everyone!
I’ve been struggling with this topic for a while. I have a fairly ordinary scene (stats: ~400k polys, ~200 batches and 2 realtime point lights), but its performance is poor: it runs at ~48 FPS.
So I went to the profiler and to me it is clear that the game is GPU bound.
Digging into the GPU section of the profiler’s hierarchy with Deep Profile on (which is the reason for the wide blue/script bands in the CPU view), I see that the process that takes the most time is ForwardOpaque. However, I do not know what to do with this information, since I do not know what ForwardOpaque means or what subprocesses it has (I haven’t been able to find in-depth info about it in the docs).
So the questions are: how do you usually approach these issues? Where can I find info about the processes shown in the profiler, so I can get to the root of the problem? I would really appreciate any help.
EDIT
That screenshot is misleading since I have “Deep Profile On”. Here you can see without it that the CPU is waiting for the GPU to finish:
ForwardOpaque/RenderLoop.ScheduleDraw
Rendering your scene meshes takes 30% - 5 [ms]
This is a lot, but also OK if your scene is indeed complex. You can slowly bring these numbers down by optimizing the meshes and their shaders, batching things together (statically, maybe) and improving occlusion culling to reduce the number of meshes the batcher needs to process.
These numbers are unacceptable and also easy to fix. Real low-hanging fruit you have here. No post process should take this much, seriously. Optimize this ASAP because disabling DOF & TAA will almost double your FPS (!)
Note: post-process performance is often very sensitive to rendering resolution. Some post-fx offer options to operate at lower quality and/or resolution.
ColorPyramid and VolumetricLighting
If you don’t use these to great effect, or can substitute something simpler for them, you may want to do that as well. Removing or optimizing them won’t bring great gains, but anything that takes over 1 [ms] should raise alarms in your head and be added to your performance watchlist.
First of all, thank you a lot for taking the time to look at the data and for your in-depth response. I was able to improve performance to ~80 FPS just by disabling some post-processing effects and tuning the settings of others. I have a few final questions and comments, if I may:
ForwardOpaque/RenderLoop.ScheduleDraw taking 5 [ms]: the scene is not that complex, so there must indeed be an optimization problem. However, the cause could be many things, as you said. Is there a way to go deeper in the profiler? Or what are the usual strategies to investigate the issue and find some suspects? I ask because when I go talk to the artists I want to be more precise than just “optimize everything”. I am sorry if I am asking too much, but this is my first 3D project.
Just as a comment, I had the LitShaderMode set to Both, and just by changing it to Forward Only or Deferred Only I got an improvement.
When you say “These numbers are unacceptable” and later comment that anything that takes over 1 [ms] should raise alarms, I guess that is a general guideline for post-processing and camera rendering (anti-aliasing), right? However, if I keep adding effects to the volume while keeping each of them optimized (i.e. < 1 [ms]), the total post-processing time keeps growing. Is there also a general guideline for the total time? E.g. if post-processing takes ~5 [ms] in total, should you start disabling effects, or is it okay as long as the total rendering time stays below the budget of 16 ms?
Yes. “If something takes 1ms - it better be very important” - kind of rule of thumb.
Exactly, total GPU frame time is your most important metric. Think about this as a time budget: if you want to stay (for some target machine specs, resolution & quality level) at, let’s say, 80 FPS, then your time budget is 12.5 [ms].
What you do with this information then is slice the time budget among all the GPU tasks according to how important they are in this given project (this will vary project to project).
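To make the arithmetic concrete, here is a minimal sketch of the FPS-to-budget conversion. The function name `frame_budget_ms` is made up for illustration; it is not part of Unity or any profiler API.

```python
def frame_budget_ms(target_fps: float) -> float:
    """Return the time available per frame, in milliseconds."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    # One second is 1000 ms, shared equally between all frames.
    return 1000.0 / target_fps


# Frame rates mentioned in this thread: the starting ~48 FPS,
# the common 60 FPS target, and the 80 FPS example.
for fps in (48, 60, 80):
    print(f"{fps} FPS -> {frame_budget_ms(fps):.1f} ms per frame")
```

This prints 20.8, 16.7 and 12.5 ms respectively, so the "16 ms" figure quoted earlier corresponds roughly to a 60 FPS target.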
For example:
rendering meshes @ 6ms (important)
post processes @ 3ms
feature A @ 0.4ms
feature B @ 0.4ms
feature C @ 0.2ms (unimportant)
spare time for other processes @ 2ms
After the budget is in place, you enforce it in the project by setting these numbers as production targets. As your knowledge grows you will adjust these targets accordingly ofc.
This is where the easy part ends, I am afraid. To make informed recommendations you need either a graphics programmer or at least a well-rounded technical artist; it’s just too much information and too many variables to explain in a single post. Also, I am not an HDRP user so I can’t give you any shortcuts here, maybe other than:
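One way to keep such targets honest is to compare measured timings against them. The sketch below is only an illustration: the `BUDGET_MS` table copies the example slices above, while `over_budget` and the sample measurements are hypothetical and do not come from the profile in this thread.

```python
# Per-task targets in milliseconds, copied from the example list above.
BUDGET_MS = {
    "rendering meshes": 6.0,
    "post processes": 3.0,
    "feature A": 0.4,
    "feature B": 0.4,
    "feature C": 0.2,
    "spare time": 2.0,
}


def over_budget(measured_ms, budget_ms=BUDGET_MS):
    """Return {task: overshoot in ms} for every task exceeding its target."""
    return {
        task: round(measured_ms[task] - target, 3)
        for task, target in budget_ms.items()
        if task in measured_ms and measured_ms[task] > target
    }


total = sum(BUDGET_MS.values())
print(f"allocated: {total:.1f} ms of a 12.5 ms frame")

# Hypothetical measurements, invented for illustration only.
sample = {"rendering meshes": 5.0, "post processes": 4.5}
print(over_budget(sample))
```

Note that the example slices add up to 12.0 ms, which leaves 0.5 ms of the 12.5 ms frame unassigned. With the invented sample, only post processes is flagged, at 1.5 ms over its target.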
Study RenderDoc’s diagrams until you understand what is happening on the GPU. It’s very similar to the rendering debug tool you can find inside Unity but this thing can also give you duration (approximations) for every action that took place there: