I’ve recently been doing some performance profiling comparing a Samsung Galaxy Note 20 Ultra (Snapdragon 865+) and an iPhone XS Max (Apple A12 Bionic).
According to various sites, like "Snapdragon 865 vs A12 Bionic: tests and benchmarks", the SD (Snapdragon) has higher single-core performance as well as a higher clock speed. That would lead you to believe an application should perform similarly on both devices; however, at least with Unity, this has not been the case.
While testing our game on the two phones, we get about double the performance out of the A12 Bionic vs the SD 865+ (7 ms on the A12 vs the same frame taking 14+ ms on the SD 865+). Based on the benchmarks, you would think the SD 865+ would perform better.
This leads me to believe that something odd is going on. I would expect at least similar results, definitely not a difference this big, and certainly not in the opposite direction.
It seems very odd that the Android device is performing so poorly. Are there some optimizations that are turned on for iOS but not for Android? Is the C++ for Android getting compiled with -O3 or similar optimization flags?
I would really appreciate any help here. Thanks.
Which Unity version do you use? Do you use IL2CPP for the Android build?
In Unity 2018.1 and above, you can try the ‘Master’ C++ compiler configuration in Player Settings:

Hmm, those are both interesting findings. I knew about the first one, and from my testing there is little to no performance difference between Release and Master (ReleasePlus). I didn’t know about the second one, but it looks like it has been fixed in the Unity version we are using. Also, my main concern is main-thread performance, which I would expect to be running on one of the high-performance “big” cores.
I have done some small tests, and it seems that at least with some raw math(ish) tests (basic math and hashing), the application scales as expected between the A12 and the SD 865+. However, within the game, we don’t see similar performance scaling. Maybe memory performance differs substantially between the two chips, but from what I could find online, there doesn’t appear to be more than a ~20% difference in memory speed.
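To give a concrete idea of what I mean by “raw math(ish) tests”, the shape of it was roughly the snippet below. This is just an illustrative standalone Java sketch, not the actual in-game test; the FNV-1a hash, buffer size, and iteration counts are arbitrary choices.

```java
import java.util.Random;

// Illustrative sketch of a hash-loop micro-benchmark.
// Not the exact test used in-game; FNV-1a and the sizes are arbitrary.
public class HashBench {
    static long fnv1a64(byte[] data) {
        long hash = 0xcbf29ce484222325L;      // FNV offset basis
        for (byte b : data) {
            hash ^= (b & 0xff);
            hash *= 0x100000001b3L;           // FNV prime
        }
        return hash;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1 << 20];      // 1 MB of pseudo-random input
        new Random(12345).nextBytes(data);

        long sink = 0;
        for (int i = 0; i < 10; i++) sink ^= fnv1a64(data);   // warm-up

        long start = System.nanoTime();
        for (int i = 0; i < 100; i++) sink ^= fnv1a64(data);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        // Print the sink so the loop cannot be optimized away
        System.out.println("100 hashes of 1 MB took " + elapsedMs + " ms (sink=" + sink + ")");
    }
}
```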
It is just weird, because it “feels” like we should be getting better performance based on what we are doing, and there are also other benchmarks that would support that expectation.
When you say performance, do you mean the actual framerate or main thread free time? What does the profiler look like on each device?
If it’s framerate, what does your game look like in terms of graphics and visual effects? Unless you’re profiling with the camera disabled and no GUI, GPU performance will factor in, and in that area the A12 is an absolute monster.
I am only talking about main thread CPU time. I have already factored out any performance difference from render GPU time (specifically, waiting for render thread or for frame present).
So basically it is:
time measured = main thread CPU time - vsync - time spent waiting for GPU or GFX thread
Even if I simplify this further and only count time spent before any rendering occurs, the results are pretty much the same: twice as slow on the SD 865+ vs the A12.
Is there a chance the Samsung device is throttling?
I don’t see anything to indicate it would be throttling. I’m not maxing it out; frame time is less than 16 ms with vsync set to 60 fps. And the numbers are consistent: with a lower load, the frame time scales down as you would expect.
To find out whether this is something with Unity itself or down to hardware/OS differences, you’d need to test with something other than Unity as a control: a pure C++ project where you are in full control of compilation and optimization settings, for example, or a check of whether you get the same results with UE4 or Godot.
Yeah, that is what I was saying above: I did verify this with a (very) small benchmark of my own, and the results seem to show that raw CPU performance is very similar between the two processors. But in regular gameplay, I do not get similar results.
More sleuthing has uncovered the following:
- I did another test to better compare raw CPU performance: a hashing test run on both devices. The A12 averaged 18.67 ms and the SD 865+ averaged 14.55 ms (18.67 / 14.55), so the SD 865+ was ~28% faster.
- During this test, CPU usage looked like the 1st image. Notice the usage of CPU 7 (the high-performance core in the SD) and how it is maxed out by the main thread (you have to zoom in in the original UI to see that it is the main thread, so just trust me on that).
- I ran the same test from #2, but with a smaller amount of data to hash (so it should be less time consuming, easier on the CPU, etc.), and the second image shows what that looks like. Notice how CPU 7 is now missing; my guess is that this is either a bug or that no activity was recorded on that core.
- I tested the actual game, and you can see what that looks like in the 3rd image here. Notice how there is very light usage on CPU 7. Very odd.
My theory (though it has holes): the Android OS decides that the game activity is not demanding enough to warrant the high-performance core, so it moves it to one of the medium-performance cores (the SD 865+ has 4 low-performance, 3 medium-performance, and 1 high-performance core). This does not explain why the hash test still runs faster on the SD, since it would then be running on a medium-performance core, which I highly doubt is faster than the A12’s high-performance cores. More importantly, it does not explain why the game overall still performs worse than on the A12 (to be precise, by “overall game performance” I mean per-frame CPU time, although per-frame scripting performance is very similar in perf % between the SD and the A12). One further test would be to run the game under a higher load and see whether that makes the scheduler use the big core.
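One way I could try to sanity-check this from inside the app is to log which core the Unity main thread last ran on, by reading field 39 ("processor") from the thread's /proc stat file. A rough Java sketch (the helper class is my own, not anything from Unity):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Minimal sketch: report which CPU core a Linux thread (tid) last ran on,
// by parsing field 39 ("processor") of /proc/self/task/<tid>/stat.
// Call it with android.os.Process.myTid() from the thread you care about.
public final class CoreProbe {
    public static int lastCpuOf(int tid) {
        try (BufferedReader r = new BufferedReader(
                new FileReader("/proc/self/task/" + tid + "/stat"))) {
            String stat = r.readLine();
            // comm (field 2) can contain spaces, so parse from the last ')'
            String[] after = stat.substring(stat.lastIndexOf(')') + 1).trim().split("\\s+");
            // after[0] is field 3 (state), so field 39 ("processor") is after[36]
            return Integer.parseInt(after[36]);
        } catch (IOException | RuntimeException e) {
            return -1; // unreadable or unexpected format
        }
    }
}
```

Logging that periodically from the main thread should show whether it ever lands on CPU 7 during normal gameplay.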
Maybe you can try setting the thread affinity of the main thread (or the render thread) to core 7 to see whether that makes a performance difference?
The documentation is here: Unity - Manual: Android thread configuration.
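In case you are not already passing the arguments that way: those flags are appended from a custom activity that overrides updateUnityCommandLineArguments, roughly like the sketch below (the package and class names are placeholders, and the mask is just an example), and the AndroidManifest has to reference that activity instead of the default one.

```java
package com.example.app; // placeholder package

import com.unity3d.player.UnityPlayerActivity;

// Sketch of a custom activity that appends the thread-configuration flag
// to Unity's command line. The affinity mask below is only an example.
public class CustomUnityPlayerActivity extends UnityPlayerActivity {
    @Override
    protected String updateUnityCommandLineArguments(String cmdLine) {
        String extra = "-platform-android-unitymain-affinity 0x80";
        if (cmdLine == null || cmdLine.isEmpty()) {
            return extra;
        }
        return cmdLine + " " + extra;
    }
}
```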
Thanks for the link. I tried it, and it appears nothing changed. I tried setting both
-platform-android-cpucapacity-threshold 1000
and
-platform-android-unitymain-affinity 0b10000000
and neither of those had any apparent effect. Note that I set each one separately, not simultaneously. The systrace still looks almost identical, with very little activity on CPU 7.
I put a log in the Java code, and it is indeed being called to set up the command line parameters, but for some reason there is no effect. Maybe I am doing something wrong… or something is broken…
The following method can be used to confirm the thread affinity:
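For example, one way is to read Cpus_allowed_list from the thread's status file in /proc; a rough sketch below, where the helper class name is just illustrative:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// One way to check a thread's affinity: read Cpus_allowed_list from
// /proc/self/task/<tid>/status for the UnityMain thread's tid.
public final class AffinityCheck {
    public static String allowedCpusOf(int tid) {
        try (BufferedReader r = new BufferedReader(
                new FileReader("/proc/self/task/" + tid + "/status"))) {
            String line;
            while ((line = r.readLine()) != null) {
                if (line.startsWith("Cpus_allowed_list:")) {
                    return line.substring("Cpus_allowed_list:".length()).trim();
                }
            }
        } catch (IOException e) {
            // fall through
        }
        return "unknown";
    }
}
```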
Thanks, it seems that the allowed CPUs are set to the last 4 cores, even though I used the following command line options:
-platform-android-unitymain-affinity 0x80
-platform-android-unitymain-priority -20
So it seems like there is some bug here. I tried
-platform-android-unitymain-affinity 0b10000000
as well, but there was no difference.

Any news about this issue?
I’m seeing similar performance differences between an iOS device (iPhone XS Max) and an Android device (Samsung S22).
Definitely bumping this topic. I have checked via AGI (Android GPU Inspector) that the Pixel 6 Pro does set the core affinity properly, but the performance difference is nevertheless rather large. Not sure affinity is the only culprit after all.
Facing the same issue, specifically with big.Mid.LITTLE configurations, for example 1 big, 3 mid, and 4 little cores. 4+4 big.LITTLE configurations work as expected.