Hello [mention|t2G9XqT1+DL/oNOrA9RaLg==] (tagging you because of this very useful post, and I was wondering what might have changed since July),
I’m using HDRP and trying to get some GPU profiler stats at runtime, but I haven’t had any luck yet. (I’m using 2020.2.4, HDRP 10.3.1 on Windows.)
When I turn off GraphicsJobs to allow GPU profiling, I can see how long each step takes by drilling down the hierarchy in the GPU profiler. But it takes a lot of time to individually find and select each Post Processing step, and I then need to manually reselect it every time I shift frames… which is why I’m trying to do it with ProfilerRecorders or possibly the old Recorders.
I’ve set up a number of ProfilerRecorders which I have running in editor and in dev builds so I can see how much time the post processing stack is taking. The problem I’m getting is that I only seem to be able to get the CPU times for these calls.
I’m aware that the old profiler system had an option for getting GPU time with the gpuElapsedNanoseconds API. However, when I try to use it I have no way of knowing which of the 3.5k+ available tags (name lists retrieved with this and this) are meant to work with the GPU.
The only setting I can find for collecting GPU times is the one used when creating a CustomSampler, but I’m not sure how I could create one to cover already existing HDRP code without editing the HDRP package directly…
I was wondering:
Is it possible to determine whether an already existing profiler sample records CPU or GPU time? (Are all of the profiler names still just for CPU code, or do only custom samplers support GPU timings?)
What if the profiler names are the same for the GPU as for the CPU (e.g. CPU Bloom vs GPU Bloom)? How would those samplers be differentiated?
Is there (or will there eventually be) any way to specifically get GPU times instead of CPU times with ProfilerRecorders, as there is with Recorders?
Is there any way to get the overall GPU time from ProfilerRecorder or Recorder, or by any other method, when using HDRP? (That is, the value shown in the middle of the Profiler window next to the CPU time.)
That issue is addressed in 2021.1, and sadly the set of changes is too big and risky to backport. But when profiling a build, you can use a newer Editor to open an empty project and use its Profiler to profile your 2020.2 build. Then it’s: select once, see the selection in every view and frame.
At the moment, only the Recorder API can record GPU timings. As of 2021.2, it can do so on all platforms. Making the ProfilerRecorder API work with GPU timings as well is still in progress.
The same marker can contain both CPU and GPU times. With Recorder API, CPU times are then reported via elapsedNanoseconds and GPU timings via gpuElapsedNanoseconds.
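As a minimal sketch of the point above (one marker, two readings), a single Recorder can be polled for both values. The component name and the "Bloom" marker name are assumptions for illustration only; substitute a name that Sampler.GetNames actually returns in your project.

```csharp
using UnityEngine;
using UnityEngine.Profiling;

// Hypothetical helper: logs the CPU and GPU time of one profiler marker.
public class CpuGpuMarkerLogger : MonoBehaviour
{
    // Assumed marker name; replace with one that exists in your project.
    [SerializeField] private string markerName = "Bloom";

    private Recorder recorder;

    private void OnEnable()
    {
        // Recorder.Get returns a Recorder even for unknown names; check isValid before use.
        recorder = Recorder.Get(markerName);
        recorder.enabled = true;
    }

    private void Update()
    {
        // Only log once every 60 frames to keep the console readable.
        if (Time.frameCount % 60 != 0) return;
        if (recorder == null || !recorder.isValid) return;

        // Both values come from the same marker and are reported in nanoseconds.
        double cpuMs = recorder.elapsedNanoseconds * 1e-6;
        double gpuMs = recorder.gpuElapsedNanoseconds * 1e-6;
        Debug.LogFormat("{0}: CPU {1:F3} ms, GPU {2:F3} ms", markerName, cpuMs, gpuMs);
    }
}
```

A GPU value of 0 does not necessarily mean the marker has no GPU work; it may simply not have been hit in that frame, or GPU timing may not be available on the current platform and Unity version.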
FrameTimingManager can provide some such measurements on some platforms and graphics APIs. Now that we have GPU measurements implemented on all of these and exposed through the Recorder API for 2021.2, aligning FrameTimingManager and ProfilerRecorder to use these new measurement capabilities is up next.
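For the overall frame times, a rough sketch using FrameTimingManager might look like the following. The component name is hypothetical, and this assumes the platform and graphics API support frame timing and that the "Frame Timing Stats" player setting is enabled; where it is not supported, no timings are returned.

```csharp
using UnityEngine;

// Hypothetical helper: logs whole-frame CPU and GPU times via FrameTimingManager.
public class FrameTimingLogger : MonoBehaviour
{
    // Buffer for the most recent frame's timing data.
    private readonly FrameTiming[] timings = new FrameTiming[1];

    private void Update()
    {
        // Ask Unity to capture timings, then fetch the latest available frame.
        FrameTimingManager.CaptureFrameTimings();
        uint count = FrameTimingManager.GetLatestTimings(1, timings);

        // A count of 0 means no data (yet), or no support on this platform / graphics API.
        if (count == 0) return;

        // cpuFrameTime and gpuFrameTime are reported in milliseconds.
        Debug.LogFormat("Frame: CPU {0:F2} ms, GPU {1:F2} ms",
            timings[0].cpuFrameTime, timings[0].gpuFrameTime);
    }
}
```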
Ok, I’ve made a testing script which goes through all of the available Recorders and determines which ones currently report GPU times. So far I’ve tested it in 2020.2 (editor), 2020.2 (build) and 2021.1.b6 (editor): only in 2020.2 (editor) am I seeing any reported GPU times, and then just the following ten:
This is actually a great improvement for me!
Question)
Is it expected to get so few (built-in) working GPU tags in 2020.2?
Is it likely that we’ll get access to more working GPU profiler tags with subsequent releases of 2020.2 LTS, or with 2021.1 when it’s out of beta?
Is the lack of any currently working tags in 2021.1 with HDRP 11 because this is still a feature in development and it’ll be filled in over time? (Potentially a long time, as you said earlier, like 2021.2?)
Source code for my testing script (CC0)
[Edit: For anyone wanting to use this code I posted an updated version of it below which works much better → Returns 50+ recorders instead of around 10.]
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.Profiling;
/*
* Copyright 2021 By Colin Leet (https://leet.games/)
This is free and unencumbered software released into the public domain.
Anyone is free to copy, modify, publish, use, compile, sell, or
distribute this software, either in source code form or as a compiled
binary, for any purpose, commercial or non-commercial, and by any
means.
In jurisdictions that recognize copyright laws, the author or authors
of this software dedicate any and all copyright interest in the
software to the public domain. We make this dedication for the benefit
of the public at large and to the detriment of our heirs and
successors. We intend this dedication to be an overt act of
relinquishment in perpetuity of all present and future rights to this
software under copyright law.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
For more information, please refer to <http://unlicense.org/>
*/
namespace LeetProfiling {
/// <summary>
/// This is a testing script for finding out which of Unity's samplers can report GPU times on your given platform.
///
/// This script creates recorders for all of the names returned by <see cref="Sampler.GetNames(List{string})"/>.
/// It then runs for a bit and checks which recorders report positive values for <see cref="Recorder.gpuElapsedNanoseconds"/>.
///
/// Note: Only the editor and development builds can use Recorders as of Unity 2020.2
/// </summary>
public class TestGPURecorders : MonoBehaviour {
[Tooltip("Enables the recorder GPU testing.")]
public bool RunTestInStart = true;
[Tooltip("Seconds to wait before testing for GPU profilers.")]
[Range(1, 30f)]
public float WaitTimeBeforeTest = 5f;
[Header("Runtime Test Results")]
[Tooltip("Number of recorders which were retrieved with Sampler.GetNames")]
public int NumRecorders = 0;
[Tooltip("Number of recorders which reported being valid.")]
public int ValidRecorders = 0;
[Tooltip("Number of recorders which have positive GPU times.")]
public int ActiveGPURecorders = 0;
/// <summary>
/// List of names generated from <see cref="Sampler.GetNames(List{string})"/>.
/// </summary>
private List<string> allNames = new List<string>();
/// <summary>
/// List of all recorders generated from <see cref="Sampler.GetNames(List{string})"/>.
/// </summary>
private List<Recorder> allRecorders = new List<Recorder>();
/// <summary>
/// List of the keys for <see cref="GpuRecorders"/>.
/// </summary>
public List<string> GpuNamesActive = new List<string>();
/// <summary>
/// This is a dictionary of all of the actively recording GPU recorders.
/// </summary>
public Dictionary<string, Recorder> GpuRecorders = new Dictionary<string, Recorder>();
private void Start() {
if ( RunTestInStart ) {
// Init the recorder vars.
InitiateAllRecorders();
StartCoroutine(WaitBeforeTest());
}
}
/// <summary>
/// Initiates all of the supported recorders.
/// </summary>
public void InitiateAllRecorders() {
// Only run in editor or debug builds.
if ( !( Application.isEditor || Debug.isDebugBuild ) ) return;
allNames.Clear();
allRecorders.Clear();
GpuNamesActive.Clear();
GpuRecorders.Clear();
Sampler.GetNames(allNames);
NumRecorders = allNames.Count;
int lenRecorders = allNames.Count;
for ( int i = 0; i < lenRecorders; i++ ) {
allRecorders.Add(Recorder.Get(allNames[i]));
allRecorders[i].enabled = true;
}
}
/// <summary>
/// Waits a period then runs the test, then logs the results.
/// </summary>
public IEnumerator WaitBeforeTest() {
// Only run in editor or debug builds.
if ( !( Application.isEditor || Debug.isDebugBuild ) ) yield break;
// Wait for the scene to get going.
yield return new WaitForSecondsRealtime(WaitTimeBeforeTest);
TestGPURecordingProfilers();
yield return null;
LogResults();
}
/// <summary>
/// Tests which GPU times have positive values.
/// This may be called multiple times outside of <see cref="WaitBeforeTest"/>.
/// </summary>
public void TestGPURecordingProfilers() {
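ValidRecorders = 0;
// Start from the recorders found by earlier calls: only new entries are counted below,
// so resetting to 0 here would report a wrong total when this method is called again.
ActiveGPURecorders = GpuRecorders.Count;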
// Determine all of the active gpu recorders
int lenRecorders = allNames.Count;
for ( int i = 0; i < lenRecorders; i++ ) {
if ( allRecorders[i].isValid ) {
ValidRecorders++;
if ( allRecorders[i].gpuElapsedNanoseconds > 0 ) {
if ( !GpuRecorders.ContainsKey(allNames[i]) ) {
GpuNamesActive.Add(allNames[i]);
GpuRecorders[allNames[i]] = allRecorders[i];
ActiveGPURecorders++;
}
}
}
}
}
/// <summary>
/// Lists all of the active gpu recorders.
/// </summary>
public void LogResults() {
Debug.LogFormat("{0} Recorders are reporting positive times for the GPU... Listing <{0}>: ", ActiveGPURecorders);
int countActive = GpuNamesActive.Count;
for ( int i = 0; i < countActive; i++ ) {
Debug.LogFormat("{0}: {1:N5} ms",
GpuNamesActive[i],
GpuRecorders[GpuNamesActive[i]].gpuElapsedNanoseconds * 1e-6);
}
}
}
}
No, I suspect you need to wait some ~3 frames after startup for the GPU Samplers to have fully registered with the Profiler so that they’ll actually show up when calling Sampler.GetNames. In the Editor, some markers might get used before you enter playmode or in a previous run in the same Editor session so they would already be known to the system.
I don’t see a likely chance for new GPU marker additions in the base Unity Releases for these versions. Also, the base version only has very few GPU markers to begin with. That said, the relevant releases for GPU markers aren’t tied to the base editor & player Unity version but to the SRPs’ version and I don’t know what their plans are on that topic.
As I mentioned above, the test scenario you set up might be missing some markers. Searching the HDRP packaged C# code for “CommandBuffer.BeginSample” usages might give you a fuller picture, also in cases where that particular Profiler Marker was just not hit in the frame you are checking against, as not all frames are necessarily using all features.
That said, higher coverage of platforms and graphics APIs as well as linking stuff up with FrameTimingManager and the current focus we have on this might lead to further additions down the road, likely only for 2021.2 and up though as all other versions are already branched off and stabilizing. But that is a bit of conjecture on my part.
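To illustrate what the CommandBuffer.BeginSample usages mentioned above look like when you search the SRP code, here is a small sketch of a sampled command buffer section. This is not HDRP's actual code: the class name, the "MyCustomPass" marker name and the blit are all assumptions for illustration.

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// Hypothetical example of wrapping GPU work in a named profiler sample.
public static class SampledPassExample
{
    // Assumed marker name; this is the string you would later look up
    // via Sampler.GetNames / Recorder.Get.
    private const string MarkerName = "MyCustomPass";

    public static void RecordPass(
        CommandBuffer cmd,
        RenderTargetIdentifier source,
        RenderTargetIdentifier destination,
        Material material)
    {
        // Everything recorded between BeginSample and EndSample is attributed to the marker.
        cmd.BeginSample(MarkerName);
        cmd.Blit(source, destination, material);
        cmd.EndSample(MarkerName);
    }
}
```

The name passed to BeginSample is what would show up in the name list once the command buffer has actually been executed, which is why markers for features that are not used in a given frame can be missing from a test run.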
Also, ProfilerRecorderHandle.GetAvailable should include the marker names for the GPU-enabled samplers, same as Sampler.GetNames (but ProfilerRecorderHandle.GetAvailable also gives you the available Rendering Counters). Since the API the GPU markers are using doesn’t allow for setting a category, they’d all get lumped into the Scripting category until we add GPU capabilities to the ProfilerMarker API.
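A small sketch of listing what ProfilerRecorderHandle.GetAvailable returns, including each entry's category and unit, could look like this. The component name and the nameFilter field are hypothetical; the filter is only there to keep the output short, since the full list can contain thousands of entries.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using Unity.Profiling.LowLevel.Unsafe;
using UnityEngine;

// Hypothetical helper: lists available profiler markers/counters whose name contains a filter.
public class AvailableMarkerLister : MonoBehaviour
{
    // Assumed filter text; leave empty to list everything.
    [SerializeField] private string nameFilter = "Bloom";

    private void Start()
    {
        var handles = new List<ProfilerRecorderHandle>();
        ProfilerRecorderHandle.GetAvailable(handles);

        var sb = new StringBuilder();
        int matches = 0;
        foreach (ProfilerRecorderHandle handle in handles)
        {
            ProfilerRecorderDescription desc = ProfilerRecorderHandle.GetDescription(handle);
            bool keep = string.IsNullOrEmpty(nameFilter)
                || desc.Name.IndexOf(nameFilter, StringComparison.OrdinalIgnoreCase) >= 0;
            if (!keep) continue;

            matches++;
            sb.AppendLine(desc.Name + " (category: " + desc.Category.Name + ", unit: " + desc.UnitType + ")");
        }

        Debug.LogFormat("{0} of {1} available entries match '{2}':\n{3}",
            matches, handles.Count, nameFilter, sb);
    }
}
```

As with Sampler.GetNames, running this in Start may be too early for GPU markers that have not been hit yet, so calling it again a few frames or seconds later may return more entries.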
Oh ok, that makes a lot of sense! I thought the three-frame delay was about waiting until after a recorder was fully created and enabled from a script (not also about needing to wait those frames for the names to be added to Sampler.GetNames). I’ll modify my script to allow for that in debug builds.
Also just to confirm)
Does initiation in this context just mean waiting three(ish) frames after level launch for Sampler.GetNames to fully populate? Or does it also include needing to wait those frames after setting a particular “recorder.enabled = true”?
And/or waiting three frames after first querying an already enabled recorder with “Recorder.gpuElapsedNanoseconds”?
That’s why I added a variable for the length of the delay after creating/enabling the recorders: to allow any startup behavior in other scripts to “warm up” before testing whether their GPU recorders are active.
Ok, I’ve updated my testing script and now I’m seeing around 51 available profilers (when using VR), with near identical numbers of GPU profilers in editor and dev builds! This is SO AWESOME!
Also, it’s probably worth noting that I did see a small increase (2-3) in the number of available profiler recorders between testing them at around 8 and 13 seconds after level load (I didn’t bother to figure out which ones for now). For anyone reading: if your Recorder isn’t reporting immediately, you might just need to wait a few more seconds for it to become active.
Here is the updated code:
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.Profiling;
/*
* Copyright 2021 By Colin Leet (https://leet.games/)
This is free and unencumbered software released into the public domain.
Anyone is free to copy, modify, publish, use, compile, sell, or
distribute this software, either in source code form or as a compiled
binary, for any purpose, commercial or non-commercial, and by any
means.
In jurisdictions that recognize copyright laws, the author or authors
of this software dedicate any and all copyright interest in the
software to the public domain. We make this dedication for the benefit
of the public at large and to the detriment of our heirs and
successors. We intend this dedication to be an overt act of
relinquishment in perpetuity of all present and future rights to this
software under copyright law.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
For more information, please refer to <http://unlicense.org/>
*/
namespace LeetProfiling {
/// <summary>
/// This is a testing script for finding out which of Unity's samplers can report GPU times on your given platform.
///
/// This script creates recorders for all of the names returned by <see cref="Sampler.GetNames(List{string})"/>.
/// It then runs for a bit and checks which recorders report positive values for <see cref="Recorder.gpuElapsedNanoseconds"/>.
///
/// Note: Only the editor and development builds can use Recorders as of Unity 2020.2
/// </summary>
public class TestGPURecorders : MonoBehaviour {
[Tooltip("Enables the recorder GPU testing.")]
public bool RunTestInStart = true;
[Tooltip("Seconds to wait before initiating the recorders and the allNames list.")]
[Range(3, 9f)]
public float WaitTimeBeforeInitiation = 3f;
[Tooltip("Seconds which will be waited before running the test for GPU profilers.")]
[Range(1, 30f)]
public float WaitTimeBeforeTest = 5f;
[Tooltip("Runs the test twice...")]
public bool RunTestTwice = true;
[Header("Runtime Test Results")]
[Tooltip("Number of recorders which were retrieved with Sampler.GetNames")]
public int NumRecorders = 0;
[Tooltip("Number of recorders which reported being valid.")]
public int ValidRecorders = 0;
[Tooltip("Number of recorders which have positive GPU times.")]
public int ActiveGPURecorders = 0;
/// <summary>
/// List of names generated from <see cref="Sampler.GetNames(List{string})"/>.
/// </summary>
private List<string> allNames = new List<string>();
/// <summary>
/// List of all recorders generated from <see cref="Sampler.GetNames(List{string})"/>.
/// </summary>
private List<Recorder> allRecorders = new List<Recorder>();
/// <summary>
/// List of the keys for <see cref="GpuRecorders"/>.
/// </summary>
public List<string> GpuNamesActive = new List<string>();
/// <summary>
/// This is a dictionary of all of the actively recording GPU recorders.
/// </summary>
public Dictionary<string, Recorder> GpuRecorders = new Dictionary<string, Recorder>();
private void Start() {
if ( RunTestInStart ) {
StartCoroutine(StartTestsWithDelays());
}
}
/// <summary>
/// Initiates all of the supported recorders.
/// </summary>
public void InitiateAllRecorders() {
// Only run in editor or debug builds.
if ( !( Application.isEditor || Debug.isDebugBuild ) ) return;
allNames.Clear();
allRecorders.Clear();
GpuNamesActive.Clear();
GpuRecorders.Clear();
Sampler.GetNames(allNames);
NumRecorders = allNames.Count;
ValidRecorders = 0;
ActiveGPURecorders = 0;
int lenRecorders = allNames.Count;
for ( int i = 0; i < lenRecorders; i++ ) {
allRecorders.Add(Recorder.Get(allNames[i]));
allRecorders[i].enabled = true;
}
}
/// <summary>
/// Waits a period then runs the test, then logs the results.
/// </summary>
public IEnumerator StartTestsWithDelays() {
// Only run in editor or debug builds.
if ( !( Application.isEditor || Debug.isDebugBuild ) ) yield break;
// Wait for the scene to get going.
yield return new WaitForSecondsRealtime(WaitTimeBeforeInitiation);
// Init the recorder vars.
InitiateAllRecorders();
// Run the test (a second time if RunTestTwice is set), waiting before each pass.
for ( int i = 0; i < 2; i++ ) {
yield return new WaitForSecondsRealtime(WaitTimeBeforeTest);
AddNewGPURecordingProfilers();
yield return null;
LogResults();
if ( !RunTestTwice ) yield break;
}
}
/// <summary>
/// Tests which GPU times have positive values.
/// This may be called multiple times outside of <see cref="StartTestsWithDelays"/>.
/// </summary>
public void AddNewGPURecordingProfilers() {
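// Recount the valid recorders on every call so repeated runs don't double-count them.
ValidRecorders = 0;
// Determine all of the active gpu recorders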
int lenRecorders = allNames.Count;
for ( int i = 0; i < lenRecorders; i++ ) {
if ( allRecorders[i].isValid ) {
ValidRecorders++;
if ( allRecorders[i].gpuElapsedNanoseconds > 0 ) {
if ( !GpuRecorders.ContainsKey(allNames[i]) ) {
GpuNamesActive.Add(allNames[i]);
GpuRecorders[allNames[i]] = allRecorders[i];
ActiveGPURecorders++;
}
}
}
}
}
/// <summary>
/// Lists all of the active gpu recorders.
/// </summary>
public void LogResults() {
Debug.LogFormat("{0} Recorders are reporting positive times for the GPU... Listing <{0}>: ", ActiveGPURecorders);
int countActive = GpuNamesActive.Count;
for ( int i = 0; i < countActive; i++ ) {
Debug.LogFormat("{0}: {1:N5} ms",
GpuNamesActive[i],
GpuRecorders[GpuNamesActive[i]].gpuElapsedNanoseconds * 1e-6);
}
}
}
}
Both. First the Markers need to be hit by code execution, then there’s the frame delay after turning the recorders on.
But it seems you already got that working.
Measuring GPU timings, e.g. via these recorders, can have a bit of an impact on general performance, depending also on the graphics API. For more info, and also some metrics that are available in Release builds, check out this thread.
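Putting the two waits described above together (first until the marker has been hit, then a few frames after enabling the recorder), a minimal sketch could look like this. The component name, the "Bloom" marker name and both frame counts are assumptions for illustration; the three warm-up frames follow the "~3 frames" estimate given earlier in the thread.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.Profiling;

// Hypothetical helper: waits for a marker to exist, enables its recorder,
// waits a few more frames, then reads the GPU time once.
public class DelayedGpuRecorderRead : MonoBehaviour
{
    [SerializeField] private string markerName = "Bloom";      // assumed marker name
    [SerializeField] private int maxFramesToWaitForMarker = 600; // assumed upper bound
    [SerializeField] private int warmUpFrames = 3;               // assumed, from "~3 frames"

    private IEnumerator Start()
    {
        // Step 1: the marker must have been hit at least once before its recorder is valid.
        Recorder recorder = Recorder.Get(markerName);
        int waited = 0;
        while (!recorder.isValid && waited < maxFramesToWaitForMarker)
        {
            yield return null;
            waited++;
            recorder = Recorder.Get(markerName);
        }

        if (!recorder.isValid)
        {
            Debug.LogWarningFormat("Marker '{0}' was not registered after {1} frames.", markerName, waited);
            yield break;
        }

        // Step 2: enable the recorder, then give it a few frames before reading.
        recorder.enabled = true;
        for (int i = 0; i < warmUpFrames; i++)
        {
            yield return null;
        }

        Debug.LogFormat("{0}: GPU {1:F3} ms", markerName, recorder.gpuElapsedNanoseconds * 1e-6);

        // Recording has a cost, so turn it off again if nothing else needs this recorder.
        recorder.enabled = false;
    }
}
```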