Hello, I think there is a performance issue in Unity 5 when running on Android devices.
I built an APK with just an empty scene and used the Adreno Profiler to look at the draw calls.
I found that Unity 5 first draws everything into a render texture, then uses a single draw call to blit that render texture to the back buffer.
This makes the pixel fill rate very high, and this behaviour didn't exist in Unity 4.6 on any mobile platform, or in Unity 5 on iOS.
I want to know whether this is a bug or not. Such a heavy blit operation is a really big cost on mobile devices. Is it possible to get the Unity 4.6 behaviour back?
Thanks
The two Adreno Profiler snapshots below show an empty scene built with Unity 5 running on an Android device.
Below are the GPU stats while running the app built with Unity 5. You can see the FPS is about 30, the GPU clock is about 450 MHz, GPU busy is about 11%, and Fragments Shaded per Second is about 60M.
My device has a resolution of 1920x1080, which is about 2M pixels; since the FPS is 30, about 60M fragments shaded per second (2M x 30) makes sense.
Below are the GPU stats while running the app built with Unity 4. You can see the FPS is about 30, the GPU clock is about 450 MHz, GPU busy is only about 4%, and Fragments Shaded per Second is about 0.4M.
It seems there really is no blit in Unity 4.
Thanks for your suggestion.
Our current version is 5.1.0f3. I’ll try the latest patch version tomorrow, and I hope the problem has been solved already.
I hope so too, but it's best to always check on the latest version, as it saves us time later when debugging and helps make sure the Unity devs know whether it's fixed or not… Thanks for the detailed report.
Something is wrong with my Unity and I can't submit a bug report.
Could you please do me a favour and submit a bug report to Unity, referencing this thread? Thanks a lot.
The additional blit from an FBO to the back buffer was added in Unity 5/Android.
It was done for various reasons: it works around buggy compositors and allows alpha in the frame buffer, works around buggy hardware scalers, gives better/working MSAA switching at runtime, supports multi-display, is a better match to Unity scripting APIs like Display, and allows read access to the back buffer, which some Unity scripting APIs need.
The cost of the blit is between 0.5 ms and 2 ms in our measurements, depending on rendering resolution and GPU. Of course it also costs some power.
We are considering making a "fast path" for cases where we can do without the extra blit, but that probably won't come in a patch release.
Yes, it probably would be best, given it's mobile and a vast number of customers are just making 2D titles for mobile. What if one wants to do their own final blit regardless, for their own scaling? We did this for many mobile titles, and in addition did it on a custom mesh for additional special effects.
I would also advocate an advanced mode for all platforms, really - wherever we want to take responsibility in exchange for getting some speed back.
0.5-2 ms is high on mobile; it's a screen's worth of parallax or non-overlapping particles.
Could it not be one of the explanations for why people are reporting slower performance on Android in Unity 5?
I mean, one issue I often see is that if you have a fixed physics timestep of 0.02, then you have to be under 20 ms per frame on Android (due to the forced vsync in release builds), otherwise you will get 2 FixedUpdates per frame. So assuming your physics and scripts take 10 ms on a slow device, plus the 2 ms from above (assuming you tested bottom-end devices), that leaves less than 8 ms for the graphics; if you go over those 8 ms, you're essentially dropping the frame rate to half!
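As a rough way to see this double-FixedUpdate situation in practice, here's a minimal sketch (the FixedUpdateCounter name is just for illustration) that counts how many FixedUpdate steps run per rendered frame; regularly seeing 2 or more means the frame took longer than the 20 ms fixed timestep:

```csharp
using UnityEngine;

// Minimal sketch: count how many FixedUpdate steps run per rendered frame.
// If this regularly logs 2 or more, the frame took longer than Time.fixedDeltaTime
// (0.02 s = 20 ms by default), so physics is being stepped multiple times per frame.
public class FixedUpdateCounter : MonoBehaviour
{
    int fixedStepsThisFrame;

    void FixedUpdate()
    {
        fixedStepsThisFrame++;
    }

    void Update() // the FixedUpdate steps for this frame have already run by now
    {
        if (fixedStepsThisFrame > 1)
            Debug.Log("FixedUpdate ran " + fixedStepsThisFrame + "x this frame, frame time " +
                      (Time.deltaTime * 1000f).ToString("F1") + " ms");
        fixedStepsThisFrame = 0;
    }
}
```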
It does seem that Unity 5 is a lot stricter about hitting the spot that gets smooth performance on low-end hardware; I find it easy to go over. Also, given you are doing this blit, how come we don't get the option to render at lower resolutions to make up for the loss (like the performance settings on iOS)?
I don’t think you can just add time deltas of stuff that’s running on the CPU (scripts, physics) to stuff that’s running on the GPU (blit).
Obviously an additional copy is not an optimal solution, and if you are GPU limited then the blit is wasting 0.5-2 ms of your frame time plus memory bandwidth and power. But so far I have not seen a case where it would cause anything close to the 30-50% slowdown reported in other threads.
The blit isn't necessarily cheaper on new devices with faster GPUs, because these tend to have very high resolution displays, and mobile GPUs usually gain more ALU power rather than much additional bandwidth.
To render at a lower resolution you can always use Screen.SetResolution. Especially on 1440p phones that’s probably always a good idea. We are thinking about adding a GUI option to the PlayerSettings or QualitySettings, maybe depending on the screen DPI, but that’s not decided yet.
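For reference, a minimal sketch of dropping the rendering resolution this way (the 0.75 scale factor is just an example value, not a recommendation):

```csharp
using UnityEngine;

// Minimal sketch: render at a fraction of the native resolution on high-DPI phones.
// The 0.75f scale is an arbitrary example; pick whatever fits your content and devices.
public class LowerRenderResolution : MonoBehaviour
{
    void Start()
    {
        int width  = Mathf.RoundToInt(Display.main.systemWidth  * 0.75f);
        int height = Mathf.RoundToInt(Display.main.systemHeight * 0.75f);
        Screen.SetResolution(width, height, true); // true = stay fullscreen
    }
}
```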
But isn't the wait for vsync done on the CPU side? So if you're hitting the point where you miss the sync on the GPU, you have to wait for the next one before the draw happens, meaning a wait on the CPU and thus a dramatic drop in frame rate. At least that's what seems to be the case on my test device with a Mali-400, as I get better performance without vsync. This device doesn't support Screen.SetResolution, so you can't drop the resolution to gain performance.
Though to be fair, is the time taken by the blit operation what shows up as Present.BlitToCurrentFB? If so, that's not the cause of the drop I'm seeing between 4 and 5; I think it might be the change in the batching algorithm.
Afaik “WaitForTargetFPS” (name in Unity Profiler) blocks the Unity main thread.
“Graphics.PresentAndSync” is blit + SwapBuffers and SwapBuffers may block when limited by the GPU or display refresh rate. Other graphics API calls may also block for the same reasons (e.g. Clear).
The wait for the actual vblank is somewhere in the compositor, check Graphics architecture | Android Open Source Project.
The value of Present.BlitToCurrentFB isn't very helpful because it only measures the wall time for the API calls; not sure why we have it. We measured the cost by comparing the frame times of the same (non-empty) scene with and without the blit.
Thanks for your reply. Do you mean that in a future version of Unity 5 the render pipeline may skip the extra blit when it is not needed? Or will you expose a setting to let users decide whether they need the extra blit? I think the second solution may be easier to implement and more flexible.
So, doing some math, if the 2 ms worst case is correct:

Old FPS | Frame time (ms) | With blit (ms) | New FPS
50.0    | 20.0            | 22.0           | 45.5
30.0    | 33.3            | 35.3           | 28.3
20.0    | 50.0            | 52.0           | 19.2
So at the top end of performance it does create quite a drop, which might be an issue for VR apps needing a high frame rate; at the bottom end, though, it's not too much of a problem, if 2 ms is accurate.
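For anyone who wants to plug in their own numbers, here's a quick sketch of the arithmetic behind that table (frame time = 1000 / FPS, add the assumed 2 ms blit cost, then invert again):

```csharp
using UnityEngine;

// Quick sketch of the arithmetic behind the table above, assuming a 2 ms blit cost.
public class BlitCostTable : MonoBehaviour
{
    void Start()
    {
        const float blitMs = 2.0f; // worst-case estimate quoted earlier in the thread
        foreach (float oldFps in new[] { 50f, 30f, 20f })
        {
            float oldMs  = 1000f / oldFps;           // frame time without the blit
            float newFps = 1000f / (oldMs + blitMs); // frame rate with the blit added
            Debug.Log(oldFps + " FPS -> " + oldMs.ToString("F1") + " ms + " + blitMs +
                      " ms blit -> " + newFps.ToString("F1") + " FPS");
        }
    }
}
```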
It would probably be a big problem if you wanted to use RenderTextures for post-processing effects, though. It would be good if you could expose that RT to us before it is output in some way, or even the shader used in the blit.
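In the meantime, here's a rough sketch of the kind of thing we do today: hook OnRenderImage on the camera and run our own material over the frame, so the last pass we control happens right before Unity's internal blit (the material/shader is our own, not something Unity ships):

```csharp
using UnityEngine;

// Rough sketch of a custom "final pass" as things stand today: attach to the camera and
// blit through your own material in OnRenderImage. Unity's internal blit to the back
// buffer still happens afterwards; this only gives control over the pass before it.
[RequireComponent(typeof(Camera))]
public class CustomFinalPass : MonoBehaviour
{
    public Material finalPassMaterial; // assign a material that uses your own shader

    void OnRenderImage(RenderTexture src, RenderTexture dest)
    {
        if (finalPassMaterial != null)
            Graphics.Blit(src, dest, finalPassMaterial); // run our shader over the frame
        else
            Graphics.Blit(src, dest); // plain copy if no material is assigned
    }
}
```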