Unexpected performance of `in` parameter modifier with IL2CPP

I wrote a performance test for the `in` parameter modifier with a large struct:

    [Test, Performance, Category("Performance/Large Struct")]
    public void In_ImmutableLargeStruct_WithInOrNot()
    {
        var unit = SampleUnit.Microsecond;
        var withoutIn = new MeasurementSettings(100, 100, 10000, "Immutable struct without in", unit);
        var withIn = new MeasurementSettings(100, 100, 10000, "Immutable struct with in ", unit);

        var immutableStruct = new ImmutableStruct(1.1, 2.2);

        Measure.Method(() => { StructInBenchmarkTest.Add(immutableStruct); }, withoutIn).Run();
        Measure.Method(() => { StructInBenchmarkTest.AddWithIn(immutableStruct); }, withIn).Run();
    }

`ImmutableStruct` is an 88-byte readonly struct. It looks like this:

    public readonly struct ImmutableStruct
    {
        public double x { get; }
        public double y { get; }
        public double z { get; }

        private readonly double m_A;
        // ....

The two `Add` functions look like this:

    public static double Add(ImmutableStruct s) { return s.x + s.y; }
    public static double AddWithIn(in ImmutableStruct s) { return s.x + s.y; }

In the Unity Editor and on an Android device with the Mono 32-bit backend, the test results match the expectation that the `in` modifier improves performance with an immutable struct.

However, on an Android device with the IL2CPP 64-bit backend, there is no noticeable performance difference with or without the `in` modifier.

I tested on Oculus, Xiaomi, and the Microsoft Android subsystem with Unity 2020.3.25f1 and 2022.2.1f1. None of these tests gave the expected performance result.

I’m not sure why the behavior is different, but we would love to have a closer look. Can you provide this information with a bug report?

Hi, I have submitted the bug, but it seems to be internal for now. The bug ID appears to be IN-31693.

Thank you. Once our QA team processes this bug report, then we will have a public link available for it.

You’re probably not seeing a performance difference with IL2CPP because the C++ compiler optimizes away the struct copy, so the `in` modifier is not needed to avoid it.
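To make that explanation concrete, here is a minimal C++ sketch of the two calling styles. It is not the code IL2CPP actually generates; `LargeStruct`, `add_by_value`, `add_by_ref`, and `same_result` are hypothetical names chosen for illustration, and the 11-double layout is only an assumption that matches the 88-byte size mentioned above.

```cpp
#include <cstddef>

// Hypothetical stand-in for the 88-byte struct: 11 doubles in total.
struct LargeStruct {
    double x, y, z;
    double rest[8];
};
static_assert(sizeof(LargeStruct) == 88, "expected 11 * 8 bytes");

// Rough analogue of Add(ImmutableStruct s): the argument is passed by value,
// so an unoptimized build copies all 88 bytes at the call site.
inline double add_by_value(LargeStruct s) { return s.x + s.y; }

// Rough analogue of AddWithIn(in ImmutableStruct s): the argument is passed
// by read-only reference, so no copy is required.
inline double add_by_ref(const LargeStruct& s) { return s.x + s.y; }

// Both styles compute the same value; only the parameter passing differs.
inline bool same_result() {
    LargeStruct s{1.0, 2.0, 3.0, {}};
    return add_by_value(s) == add_by_ref(s);
}
```

When both functions are small enough to inline, an optimizing compiler is generally free to skip the by-value copy and read `x` and `y` directly, which would make the two versions cost the same. Whether that happens in a given IL2CPP build depends on the generated code and the compiler settings, so comparing the disassembly of both calls is the reliable way to check.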

But then again, you would not expect IL2CPP (C++) code to run slower than the optimized Mono code, especially not by the degree we are seeing here. I’m also wondering whether the delegate used in the test code makes any difference for IL2CPP.