Hi All,
I’m looking at the practicalities of developing a low-latency Unity application for a time-critical purpose. My normal approach would be to implement something in C++/OpenGL, which provides a lot of control, but I am now considering whether I can do something equivalent using Unity.
In C++/OpenGL, to avoid tearing, VSync must be on, and to avoid frames building up between the CPU and GPU I would call glFinish() immediately after glutSwapBuffers(). This gives a low-latency application.
So, how would I do this in Unity? I note that there is a GL.Flush() command, but this won’t do the same as glFinish(). Therefore, my question is: how do I suspend the rendering of frames to avoid graphics command build-up between the CPU and GPU in Unity? Is there a function that will block the calling thread until the current frame has been rendered?
I appreciate that I can set maxQueuedFrames to 1 but this is not sufficient.
Yeah, that has always been my point of view too. Unity is a brilliant tool for lots of stuff but I am minded to think that when it comes to this type of application, it doesn’t really have the right hooks or right level of control. Someone please tell me I’m wrong though!
Please define what you mean by “low latency”.
I have never heard of that graphics command build-up that you mention. The engine and its user generate these commands for a single frame only, then they are executed and the next frame starts with a blank slate. Unless of course command generation and rendering can be on separate threads. But even so, the GPU or the user cannot simply decide at any one point “it’s enough, render now!” (exceptions may exist in the form of adaptive quality).
Like, if an app has to run at 120 or even 240 fps, and thus has 1/120th or 1/240th of a second to perform all tasks for any given frame, then that is absolutely possible with Unity and most other game engines, for that matter. It really only boils down to optimization and to making informed decisions about how much time can be spent on which parts of the app. Generally speaking: the fewer polys you push, the more of them you can render.
If you mean player input latency, you should look into the New Input Manager to see if they mention anything regarding latency and what you can do, if anything, to improve it or ensure it’s as fast as possible.
Hi. Thanks for your message. You are right that ‘latency’ can be defined in many ways and sometimes is difficult to quantify/measure. My application takes data from a serial input device and uses this to compose a view within a Unity app. The speed with which the software is able to present an updated view in response to data from this input serial device is critical.
In my experience, most rendering systems consist of a CPU that issues graphics commands into a buffer, and a GPU that reads and executes those commands to produce a video frame, which is subsequently displayed on your chosen display/monitor. One serious source of latency is the undesirable build-up of commands in the queue between the CPU and GPU. This typically happens when the CPU can run faster than the GPU can execute the commands. In situations where you need vertical sync enabled (let’s say for a 60 Hz display device), the rate of the GPU is limited (to 60 Hz) and it is therefore very easy for this backlog of commands to develop. As a result, the data used to generate the currently displayed frame can be several frames old, and this exhibits itself as latency in the application.
The best way to fix this is to have the GPU block the CPU until the GPU completes execution of the current frame’s commands. This way, you can guarantee that the data used to create the current view is relatively new. In a native application with, say, C++ and OpenGL, you can use glFinish() to achieve this. So, my question is: how do you achieve this with Unity?
Thanks for the clarification. Realtime hardware monitoring essentially.
From your description, that does sound an awful lot like normal operation for a game engine, however. Or, the other way around: only a flawed architecture (or one not built for the purpose) would allow this build-up of GPU commands to happen.
Or maybe you mean the issue where the app is supposed to render 60 fps but the GPU can’t render everything in that time, and therefore the framerate drops to 30 or 20? But that can only be prevented by ensuring the code is well optimized and the GPU never needs to do anything close to its limits for any given frame.
Naturally - the CPU wouldn’t simply continue to create or provide more data/commands for the GPU while it is still rendering. At least, this never happened in any realtime 3D game engine I’ve worked with. The CPU doesn’t simply start creating new commands with every VSync completion regardless of whether the GPU has finished drawing or not. It may have these commands at the ready for the next time the GPU is ready to receive commands (said queuing), but it should never add more commands for frames even further ahead to the queue while it’s waiting for the GPU. That would either kill the framerate or cause graphics glitches, because you can’t just draw the same object in multiple locations in the same frame.
So I’m really not sure if maybe I’m missing the point or am forgetting to consider something due to my lack of this kind of realtime hardware programming. But it really sounds like a non-issue.
Except… maybe you haven’t considered that Unity is practically main-thread only (setting DOTS aside for the moment). That means if you receive serial input data on a second thread at 500 Hz, you couldn’t just change the position of a GameObject from that thread. You can prepare that data and have it ready for the game-view update script to process it and position objects. If the Update loop runs at 120 Hz, you would have the most current data presented visually, but you’d also have skipped over some data updates, as the data comes in faster than the rendering occurs. Which isn’t really an issue. You could get a 240 Hz monitor, or disable VSync, and you may even get to 500 Hz of visual updates in theory; except that even on the 240 Hz monitor it will only show about half of these updates, because there is simply no visual refresh by the monitor for roughly every other serial data update (500 vs 240 Hz).
You do have options to multithread Unity, including object positions, using DOTS (Entities).
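To make that handoff concrete, here is a minimal sketch of the pattern described above, assuming the device data boils down to a position. The class name and ReadSampleFromSerialPort() are placeholders of mine, not anything Unity provides:

```csharp
using System.Threading;
using UnityEngine;

// Sketch: a background thread keeps the latest serial sample,
// and the main thread applies it once per rendered frame.
public class SerialDrivenObject : MonoBehaviour
{
    readonly object _lock = new object();
    Vector3 _latestPosition;          // most recent value from the device
    volatile bool _running = true;
    Thread _reader;

    void Start()
    {
        _reader = new Thread(ReadLoop) { IsBackground = true };
        _reader.Start();
    }

    void ReadLoop()
    {
        while (_running)
        {
            // Placeholder for however the ~500 Hz device is actually read.
            Vector3 sample = ReadSampleFromSerialPort();
            lock (_lock) { _latestPosition = sample; }
        }
    }

    void Update()
    {
        // The Update loop runs at the display rate; it always picks up the
        // newest sample and silently skips any that arrived in between.
        Vector3 p;
        lock (_lock) { p = _latestPosition; }
        transform.position = p;
    }

    void OnDestroy()
    {
        _running = false;
        _reader?.Join();
    }

    Vector3 ReadSampleFromSerialPort()
    {
        // Hypothetical blocking read; replace with real serial I/O.
        Thread.Sleep(2); // roughly 500 Hz
        return Vector3.zero;
    }
}
```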
Thanks for the post. However, there are a few assumptions in your response which I believe not to be true.
Yes, I believe it would. That is exactly the way it works. And it would do so until the CPU-GPU buffer was full and then block feeding the buffer. The CPU and GPU run in parallel exchanging data - they are not synchronized.
This is why there is a QualitySettings.maxQueuedFrames parameter to help control the magnitude of it. Graphics drivers can queue up frames to be rendered. When the CPU has much less work to do than the graphics card, it is possible for this queue to become quite large. In those cases, the user’s input will “lag behind” what is on the screen.
Yes, it would. That’s exactly what would happen, until the buffer was full.
Obviously, if anyone disagrees with this view, please let me know.
Generally, if the GPU is rendering or displaying frame #100, then the CPU is already working on frame #101. Queue of 1. With 2 it would be able to work on frame #102, increasing the input lag like you said. But a queue length of 1 is always assumed; the CPU is always working ahead of time.
I think what you’re looking for is essentially a queue length of 0 or a synchronized CPU=>GPU pipeline? Is that it?
I’m not sure this is possible even with custom engines, and in any case I wonder whether this really decreases the latency with VSync enabled.
Let’s say a Unity app runs at a stable 240 Hz. The input lag will be 1/240th at most.
If a custom engine does the same, but the CPU processes the data in 1/500th of a second and instantly passes the commands to the GPU, which itself finishes super fast in 1/10000th of a second, the GPU would still have to wait for the VSync before it can flip framebuffers and present the newly rendered frame. Therefore the input lag would still be 1/240th for the visual updates (the processing however happens faster, and that may be meaningful in ways I cannot comprehend).
That is exactly what the mentioned API, QualitySettings.maxQueuedFrames, does. Setting it to 1 should achieve what you mention in the original post: it will wait for the frame to be fully displayed before starting the next one (the wait happens before sampling input). Can you explain why that is not sufficient?
Since QualitySettings.maxQueuedFrames is only implemented on DX11 and DX12, you can also wait on the result of an AsyncGPUReadback request submitted in the previous frame to achieve the same thing on other platforms.
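For anyone trying this, a rough sketch combining both suggestions might look like the following. The component name, the tiny dummy readback texture and the end-of-frame coroutine are my own choices, and it assumes SystemInfo.supportsAsyncGPUReadback is true on the target platform:

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.Rendering;

// Sketch: cap the driver's frame queue where supported, and additionally
// block at the start of each frame until the previous frame's GPU work
// (as observed by a readback submitted after rendering) has completed.
public class GpuSyncExample : MonoBehaviour
{
    AsyncGPUReadbackRequest _previousFrameFence;
    bool _hasFence;
    Texture2D _fenceTexture;   // tiny dummy readback target (my own choice)

    void Start()
    {
        // Only honoured on DX11/DX12 according to the docs.
        QualitySettings.maxQueuedFrames = 1;

        _fenceTexture = new Texture2D(1, 1, TextureFormat.RGBA32, false);
        StartCoroutine(EndOfFrameFence());
    }

    void Update()
    {
        // Stall the CPU here until the GPU has reached the readback that was
        // queued after last frame's rendering commands were issued.
        if (_hasFence)
        {
            _previousFrameFence.WaitForCompletion();
            _hasFence = false;
        }
    }

    IEnumerator EndOfFrameFence()
    {
        var wait = new WaitForEndOfFrame();
        while (true)
        {
            yield return wait;  // all rendering for this frame has been issued
            _previousFrameFence = AsyncGPUReadback.Request(_fenceTexture);
            _hasFence = true;
        }
    }
}
```

Whether stalling at the top of Update is the right place depends on where the serial data is consumed; the point is only that WaitForCompletion() blocks the main thread until the GPU has processed everything queued before that readback.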
By the way, here is a very good explanation about system latency:
Hi! Did it work in the end?
I also want to make sure I can get the result of the current tick, but I am wondering whether QualitySettings.maxQueuedFrames just waits for the newly rendered frame to be displayed, rather than for the commands of this tick to finish executing. :(
Thank you!
Hi. I wanted to find something that can stall the CPU until the GPU finishes rendering the results of this tick’s commands. Now I use ReadPixels on the render target of my camera to achieve this (I don’t know if this really works, though). Thank you for the information, I will check!
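In case it helps, a minimal sketch of what that ReadPixels stall could look like (my own reconstruction, not verified to be the best approach). It assumes the camera renders into a RenderTexture and relies on ReadPixels blocking until the GPU has actually produced the requested pixel:

```csharp
using System.Collections;
using UnityEngine;

// Sketch of the ReadPixels-based stall: reading even a single pixel back
// from the camera's render target cannot complete until the GPU has
// finished rendering into it, so it acts as a crude glFinish().
public class ReadPixelsSync : MonoBehaviour
{
    public Camera targetCamera;   // camera assumed to render into a RenderTexture
    Texture2D _readback;

    void Start()
    {
        _readback = new Texture2D(1, 1, TextureFormat.RGBA32, false);
        StartCoroutine(SyncEachFrame());
    }

    IEnumerator SyncEachFrame()
    {
        var wait = new WaitForEndOfFrame();
        while (true)
        {
            yield return wait;   // rendering commands for this frame are issued

            var previous = RenderTexture.active;
            // If targetTexture is null, this reads from the screen instead.
            RenderTexture.active = targetCamera.targetTexture;
            // Blocks until the GPU has produced the pixel being read back.
            _readback.ReadPixels(new Rect(0, 0, 1, 1), 0, 0);
            RenderTexture.active = previous;
        }
    }
}
```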