Poor performance of synchronization primitives (lock, Volatile, Interlocked) in IL2CPP

Code that uses lock, Volatile, or Interlocked performs very poorly under IL2CPP.
It is noticeably slower than the same code under Mono.

Here is a simple benchmark. Each test increments a shared counter 100,000,000 times.

// Requires: using System.Runtime.CompilerServices; using System.Threading;
int count;

public void TestNormal()
{
    [MethodImpl(MethodImplOptions.NoOptimization)]
    static void Incr(ref int v)
    {
        v++;
    }

    for (int i = 0; i < 100000000; i++)
    {
        Incr(ref count);
    }
}

public void TestVolatile()
{
    for (int i = 0; i < 100000000; i++)
    {
        Volatile.Write(ref count, Volatile.Read(ref count) + 1);
    }
}

public void TestInterlocked()
{
    for (int i = 0; i < 100000000; i++)
    {
        Interlocked.Increment(ref count);
    }
}

public void TestLock()
{
    for (int i = 0; i < 100000000; i++)
    {
        lock (this)
        {
            count++;
        }
    }
}
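The post does not show how the timings were taken. As a rough sketch, each test method could be timed with System.Diagnostics.Stopwatch, as below. The BenchmarkTimer class and its Measure method are hypothetical names introduced here for illustration and are not part of the original benchmark.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not from the original post.
public static class BenchmarkTimer
{
    // Runs the given action once and returns the elapsed
    // wall-clock time in whole milliseconds.
    public static long Measure(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

Under that assumption, a call such as Debug.Log($"lock: {BenchmarkTimer.Measure(TestLock)}ms"); would print a line in the same format as the results below. A single run like this includes any warm-up cost, so repeating the measurement a few times would give steadier numbers.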

In an IL2CPP build (Unity 6000.0.11f1) on Windows:

normal: 114ms
volatile: 1310ms
Interlocked: 990ms
lock: 6872ms

These results are too slow to be usable.

In Mono:

normal: 142ms
volatile: 143ms
Interlocked: 347ms
lock: 1782ms

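These numbers make sense. Compared with them, the IL2CPP build is about 9x slower for volatile, about 2.9x slower for Interlocked, and about 3.9x slower for lock.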

A slow lock in IL2CPP appears to have been reported back in 2020 as well, which suggests that nothing has improved since then.

For reference, here are the benchmark results in .NET 8.

| Method          | Mean        |
|-----------------|-------------|
| TestNormal      | 23.43 ms    |
| TestVolatile    | 21.70 ms    |
| TestInterlocked | 343.34 ms   |
| TestLock        | 1,305.52 ms |

I have created an issue to track this: UUM-77985.

Note that there was a regression in Unity 6 related to Interlocked on amd64 platforms, which I fixed yesterday. The fix brings Interlocked performance roughly in line with Mono. That said, I expect we can do better on these primitives with additional work.