Hi everyone,
First of all, thanks for taking the time to look at this - we’re thankful for any input!
About the issue we’re facing right now:
We’ve been encountering a random FPS drop to near zero in our game app on Android over the past months and have so far not been successful in debugging the issue. It happens at completely random times (sometimes after 3 minutes, sometimes after 3 hours) and feels like a freeze, but the app is actually still running while the FPS drops very quickly.
We’ve already spent quite some time debugging and analysing with various tools. Now we hope that you can help us draw new conclusions from our data.
Versions:
Unity version: 2020.3.30f1
URP: 10.8.1
Clues:
- It is not a full freeze. New frames do get rendered if you wait long enough.
- Once it starts, the frame times escalate within a short period of time: first 30 ms, 50 ms, 100 ms, 500 ms, 1000 ms, then 3 minutes, 5 minutes, etc.
- Issue has been reproduced on different Android devices (e.g. OnePlus 6)
- In the Unity Profiler we can see that when the issue occurs, the main thread seems to be waiting for “Gfx.PresentFrame” on the render thread
- There is no actual increase in load. You might idle in the same scene for 30 minutes with everything being fine, then the FPS suddenly drops. We also don’t see an increase in polygon count, memory usage, or anything else in the profiler
- Issue occurs in different scenes with different objects on screen etc
- Issue occurs randomly, sometimes after 5 minutes, sometimes after 2 hours of running the application. It does not appear to be related to devices getting hot or going into power-saving mode
- Issue does not require any input/interaction. It just happens while the app is left idling
- In the Arm Mobile Studio Graphics Analyzer we can see that once the issue occurs, there is a gap of ~3 minutes (or more) between eglGetCurrentContext() and the next command (eglGetSyncAttribKHR), even though the commands themselves only take a few μs. In earlier frames (before the issue started) there is no notable delay between these two calls. The return value of eglGetCurrentContext() is the same before the issue starts and while it is happening
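
A minimal sketch of how the onset of this escalation could be pinpointed in a log of per-frame times (for example, values exported from the profiler), so that it can be lined up with other traces. The function name, window size, and thresholds are illustrative assumptions, not part of Unity or any profiler API.

```python
from statistics import median


def find_escalation_onset(frame_times_ms, window=60, factor=1.5, min_run=3):
    """Return the index of the first frame of a sustained slowdown, or None.

    A frame counts as slow when it takes more than `factor` times the median
    of the `window` frames before it. The onset is the first slow frame that
    starts a run of at least `min_run` consecutive slow frames, so a single
    isolated spike is ignored.
    """
    run_start = None
    run_length = 0
    baseline = None
    for i, t in enumerate(frame_times_ms):
        if i < window:
            continue  # not enough history to form a baseline yet
        if run_length == 0:
            # Only refresh the baseline outside a slow run, so the slow
            # frames themselves do not drag the baseline upwards.
            baseline = median(frame_times_ms[i - window:i])
        if t > factor * baseline:
            if run_length == 0:
                run_start = i
            run_length += 1
            if run_length >= min_run:
                return run_start
        else:
            run_length = 0
    return None


# Hypothetical data: 100 normal frames, then the escalation described above.
samples = [16.7] * 100 + [30, 50, 100, 500, 1000]
print(find_escalation_onset(samples))  # → 100
```

The timestamp of the reported frame can then be compared against the point in the Graphics Analyzer trace where the gap after eglGetCurrentContext() first appears.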
Screenshot from the profiler (frame time of 2+ seconds)
What we hope for here:
- We’d be incredibly grateful for any help or clues
- Does anybody know which kind of profiling tool could tell us what exactly goes wrong?
- What could be the next steps to figure out what’s wrong and how to fix this?
- Has anybody seen anything remotely similar happen before?