I have noticed that, in some cases, I get worse performance when performing an operation using a function call than when doing the operation without any function calls. Here is the code used to demonstrate this:
using System.Collections;
using System.Collections.Generic;
using UnityEngine;

public class Test : MonoBehaviour {
    // Use this for initialization
    void Start () {
        Debug.Log ("Testing math function call 10 times");
        for (int i = 0; i < 10; i++) {
            Test_FunctionCall ();
        }
        Debug.Log ("Testing NO function call 10 times");
        for (int i = 0; i < 10; i++) {
            Test_NoFuncCall ();
        }
    }

    // Times one million evaluations routed through DoMath
    void Test_FunctionCall () {
        float then = Time.realtimeSinceStartup;
        for (int x = 0; x < 100; x++) {
            for (int y = 0; y < 100; y++) {
                for (int z = 0; z < 100; z++) {
                    int result = DoMath (x, y, z);
                }
            }
        }
        float now = Time.realtimeSinceStartup;
        Debug.Log ("Did math 1000000 times in only: " + (now - then) * 1000 + "ms");
    }

    // Times the same one million evaluations written out in the loop body
    void Test_NoFuncCall () {
        float then = Time.realtimeSinceStartup;
        for (int x = 0; x < 100; x++) {
            for (int y = 0; y < 100; y++) {
                for (int z = 0; z < 100; z++) {
                    int result = x * y + (z << 2);
                }
            }
        }
        float now = Time.realtimeSinceStartup;
        Debug.Log ("Did math 1000000 times in only: " + (now - then) * 1000 + "ms");
    }

    int DoMath (int x, int y, int z) {
        return x * y + (z << 2);
    }
}
On average, calling the DoMath function 1,000,000 times takes roughly 55ms to complete. Meanwhile, doing the same math without calling the function takes roughly 23ms, saving about 30ms.
What's weird about this is that the compiler usually performs trivial optimizations (such as changing x * 2 to a bitwise operation), but even though this feels like a trivial optimization, the compiler isn't optimizing it. I know I shouldn't make a big deal about microscopic optimizations, but a difference of a few microseconds becomes significant on an operation that is being done a million times.
So I have two questions: why is the first test slower than the second, and is there any way to tell the compiler to take the "x * y + (z << 2)" operation outside of the DoMath function so that I get the same performance as in the second test?
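To make the second question concrete, here is a minimal sketch of the kind of hint being asked about. It assumes the scripting runtime exposes MethodImplOptions.AggressiveInlining (available in .NET 4.5 and later). The InlineSketch class and its two Sum methods are hypothetical names used only for illustration. The attribute is a request, not a guarantee, and whether Unity's Mono or IL2CPP backend honors it for this method is an assumption that would need to be measured.

```csharp
using System.Runtime.CompilerServices;

// Hypothetical illustration class; not part of the original Test script.
public static class InlineSketch {
    // Same arithmetic as DoMath, with a hint asking the JIT to inline the call.
    // The runtime is free to ignore the hint.
    [MethodImpl (MethodImplOptions.AggressiveInlining)]
    public static int DoMath (int x, int y, int z) {
        return x * y + (z << 2);
    }

    // Accumulates the results so the loop body has an observable effect.
    public static long SumWithCall () {
        long sum = 0;
        for (int x = 0; x < 100; x++) {
            for (int y = 0; y < 100; y++) {
                for (int z = 0; z < 100; z++) {
                    sum += DoMath (x, y, z);
                }
            }
        }
        return sum;
    }

    // Same loops with the expression written out by hand, for comparison.
    public static long SumInline () {
        long sum = 0;
        for (int x = 0; x < 100; x++) {
            for (int y = 0; y < 100; y++) {
                for (int z = 0; z < 100; z++) {
                    sum += x * y + (z << 2);
                }
            }
        }
        return sum;
    }
}
```

Both methods return the same total, so timing SumWithCall against SumInline isolates the cost of the call itself. Summing the results also keeps the work from being discarded as an unused local, which the original test does not guard against.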