Hey,
I’ve recently been playing around with Unity’s performance analysis tools and decided to check how well IL2CPP handles generic structs, so I wrote this small test:
    // Usings inferred from the calls below (NUnit attributes, Unity Performance Testing, UnityEngine.Random).
    using NUnit.Framework;
    using Unity.PerformanceTesting;
    using static UnityEngine.Random;

    public class CodeInjectionViaGenerics
    {
        private const int NUM_ITERATIONS = 500_000;

        public interface ICompute
        {
            float Compute(ref S s);
        }

        // Struct implementation, injected into A through a generic type argument.
        public struct Computer : ICompute
        {
            public float Compute(ref S s) => 2 * s.x;
        }

        // Baseline: the same computation as a plain static method.
        public static class StaticComputer
        {
            public static float Compute(ref S s) => 2 * s.x;
        }

        public struct S { public float x; }

        public class A
        {
            public S s;

            public float Compute<T>() where T : struct, ICompute =>
                default(T).Compute(ref s);
        }

        public class B
        {
            public S s;

            public float Compute() => StaticComputer.Compute(ref s);
        }

        [Test, Performance]
        public void ProfileA()
        {
            InitState(102745);
            var data = new A[NUM_ITERATIONS];
            for (int i = 0; i < NUM_ITERATIONS; i++)
            {
                // Float arguments: Range(0, 1) picks the int overload, which always returns 0.
                data[i] = new A { s = new S { x = Range(0f, 1f) } };
            }

            Measure.Method(() =>
                {
                    double sum = 0;
                    for (int i = 0; i < NUM_ITERATIONS; i++)
                    {
                        sum += data[i].Compute<Computer>();
                    }
                })
                .MeasurementCount(100)
                .IterationsPerMeasurement(100)
                .WarmupCount(2000)
                .Run();
        }

        [Test, Performance]
        public void ProfileB()
        {
            InitState(102745);
            var data = new B[NUM_ITERATIONS];
            for (int i = 0; i < NUM_ITERATIONS; i++)
            {
                // Float arguments: Range(0, 1) picks the int overload, which always returns 0.
                data[i] = new B { s = new S { x = Range(0f, 1f) } };
            }

            Measure.Method(() =>
                {
                    double sum = 0;
                    for (int i = 0; i < NUM_ITERATIONS; i++)
                    {
                        sum += data[i].Compute();
                    }
                })
                .MeasurementCount(100)
                .IterationsPerMeasurement(100)
                .WarmupCount(2000)
                .Run();
        }
    }
The main idea is to check how viable it is to inject functionality into a class through generic type arguments, which is a well-known trick in C#. The results I got were quite surprising. I profiled my code in Unity 2020.2.0f1, but the results should be the same in 2020.1 and 2019 LTS.
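For anyone who hasn't seen the trick, here is a minimal standalone sketch of it in plain C# (no Unity needed). The names IOp, DoubleOp, SquareOp and Holder are made up for illustration and are not part of the test above. Because T is constrained to a struct, each T gets its own specialization of the method, and the interface call on default(T) can be resolved statically without boxing.

```csharp
using System;

// Hypothetical operation interface (illustration only).
public interface IOp
{
    float Apply(float x);
}

// Two hypothetical "policies" that can be injected as type arguments.
public struct DoubleOp : IOp
{
    public float Apply(float x) => 2 * x;
}

public struct SquareOp : IOp
{
    public float Apply(float x) => x * x;
}

public sealed class Holder
{
    public float Value;

    // The behavior is chosen by the caller through T.
    // T is a struct, so default(T) costs nothing to create and
    // the call is a constrained call on a value type (no boxing).
    public float Compute<T>() where T : struct, IOp =>
        default(T).Apply(Value);
}

public static class Demo
{
    public static void Main()
    {
        var h = new Holder { Value = 3f };
        Console.WriteLine(h.Compute<DoubleOp>()); // prints 6
        Console.WriteLine(h.Compute<SquareOp>()); // prints 9
    }
}
```

The class itself stays non-generic; only the method takes the type argument, which mirrors how A.Compute<T> is set up in the test.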
Here are the results. I report only the median; I can post the other values if somebody is interested, but nothing interesting is going on there:
In the editor - Release Mode: ProfileA → 2.47ms, ProfileB → 2.52ms
Windows Standalone IL2CPP Release: ProfileA → 1.94ms, ProfileB → 0.30ms
Windows Standalone IL2CPP Master: ProfileA → 0.26ms, ProfileB → 0.28ms
This was definitely not what I was expecting: something really funky is happening with IL2CPP Release, where the generic version is roughly 6.5 times slower than the static one (1.94ms vs. 0.30ms). So I looked around, read the generated IL2CPP code and realized that if I add [MethodImpl(MethodImplOptions.AggressiveInlining)] to A::Compute, everything works as expected and both implementations are approximately equally fast:
Windows Standalone IL2CPP Release with AggressiveInlining: ProfileA → 0.28ms, ProfileB → 0.26ms
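For reference, this is roughly where the attribute goes. It's a trimmed sketch that repeats only the types needed to compile on its own, shown at top level instead of nested in the test class. Note that AggressiveInlining is a hint to the compiler, so I wouldn't assume the effect carries over to every build configuration without measuring.

```csharp
using System.Runtime.CompilerServices;

public struct S { public float x; }

public interface ICompute
{
    float Compute(ref S s);
}

public struct Computer : ICompute
{
    public float Compute(ref S s) => 2 * s.x;
}

public class A
{
    public S s;

    // The only change compared to the original test: the inlining hint
    // on the generic method that forwards to the struct implementation.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public float Compute<T>() where T : struct, ICompute =>
        default(T).Compute(ref s);
}
```

The call sites stay the same, e.g. data[i].Compute<Computer>().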
I definitely know the Unity devs are aware of this issue, as I got the AggressiveInlining idea from looking at their generic code, and I kind of understand how this situation came to be. I also understand that this is a very simplified measurement setup, but it illustrates the problem very clearly.
So here are a couple of questions that maybe somebody could answer:
- Is anybody aware of any other issues regarding generics and IL2CPP?
- Are there any other ways of massaging IL2CPP to get better generics performance?
- In case any Unity devs read this - is this something that might be improved in the future?
I am planning on writing an animation library that relies quite heavily on generics, structs and interfaces. If somebody has any additional tips, I would really appreciate it.
Have fun!