Using Unity.Mathematics without Burst much slower than Mathf

Hi, I’ve been using Unity.Mathematics in non-Burst code in a project (where using DOTS is not possible). I ran some simple matrix benchmarks (comparing float4x4.TRS with Matrix4x4.TRS, etc.), and float4x4 is around 5 times slower than Matrix4x4. float3 seems to be slower than Vector3 as well. So I wonder whether I should avoid Unity.Mathematics entirely if I can’t Burst-compile it. Any suggestions on this?

Also, why is Unity.Mathematics so slow when it is not Burst-compiled?

I read somewhere that Unity.Mathematics is actually slower and should only be used in conjunction with Burst or IL2CPP.

I’m indeed using IL2CPP. Perhaps I should do some benchmarks on builds.

There are many things that make it slower.

First, the methods you mentioned are actually extern functions called via interop. They are essentially “Burst” compiled (actually natively compiled) already, most likely using SIMD code.
I don’t know this for certain, but when you benchmark a specific function in a loop, the native function may also be cached, so the interop barrier (~40 assembly instructions) is only crossed once or twice. (If anyone knows more, please chime in.)

Secondly, and more importantly, float arithmetic in the Mono runtime is extremely inefficient: all arithmetic is done in double precision, with scalar conversions back and forth. With a latency of 2 clock cycles for float→double and a throughput of 1 per cycle, converting 16 floats takes at least 16 + 2 clock cycles. double→float has a latency of 4 cycles, so converting back takes 16 + 4 cycles; that is 38 clock cycles in total. This may already match the latency of the interop barrier alone.

Then there is the fact that SIMD code can multiply/add 4 floats at once, maybe even using fused multiply-add instructions.
Additionally, since 16 float registers are needed, the managed version may spill registers onto the stack and reload them, instead of using in-register SIMD vector shuffle instructions. That depends on the Mono JIT.

If you use IL2CPP, though, I’d assume the performance will be almost identical.

That would apply to both versions (float3 and Vector3) in Mono without Burst.
2021.2 uses an updated Mono and should now also treat floats internally as float.

If you haven’t tested on builds, you haven’t tested IL2CPP, since the editor always uses Mono.

Mathematics in Mono is VERY slow. (In our tests, in the editor, it was around 50% slower for super simple stuff.)

Mathematics in IL2CPP is ever so slightly faster than Mathf. (In the same test as above, in a build, Mathematics was up to 5% faster, maybe less; practically identical, but it was always slightly faster.)

Not if the function using the Vector3 is a native DLLImport function.

Cool! Let’s hope 2021 LTS isn’t too far away.

I just made an Android build with IL2CPP, and now in the profiler float4x4 looks a lot faster. Good reminder to benchmark only on builds.
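For anyone who wants to repeat this comparison in a player build, here is a minimal sketch of the kind of benchmark discussed in this thread. The class name TrsBenchmark, the iteration count, and the inputs are made up for illustration and are not the code that was actually measured; treat the timings it logs as a rough indication only.

```csharp
using Unity.Mathematics;
using UnityEngine;

// Hypothetical benchmark component (name and values are illustrative assumptions).
// Attach it to a GameObject and run it in an IL2CPP build, not in the editor.
public class TrsBenchmark : MonoBehaviour
{
    const int Iterations = 1000000; // arbitrary count chosen for the sketch

    void Start()
    {
        // Checksums keep the results "used" so the loops are not optimized away.
        float checksumMatrix = 0f;
        float checksumFloat4x4 = 0f;

        var sw = System.Diagnostics.Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
        {
            Matrix4x4 m = Matrix4x4.TRS(new Vector3(i, 0f, 0f), Quaternion.identity, Vector3.one);
            checksumMatrix += m.m03; // translation x
        }
        sw.Stop();
        long matrixMs = sw.ElapsedMilliseconds;

        sw.Restart();
        for (int i = 0; i < Iterations; i++)
        {
            float4x4 m = float4x4.TRS(new float3(i, 0f, 0f), quaternion.identity, new float3(1f));
            checksumFloat4x4 += m.c3.x; // translation x
        }
        sw.Stop();
        long float4x4Ms = sw.ElapsedMilliseconds;

        Debug.Log("Matrix4x4.TRS: " + matrixMs + " ms, float4x4.TRS: " + float4x4Ms
                  + " ms (checksums " + checksumMatrix + ", " + checksumFloat4x4 + ")");
    }
}
```

Running the same script once in the editor (Mono) and once in an IL2CPP build should show whether the gap described above comes from the scripting backend rather than from the library itself.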

Vector3 has a C++ version, but it isn’t used for normal operations.
There is no DLLImport in the reference source:
https://github.com/Unity-Technologies/UnityCsReference/blob/master/Runtime/Export/Math/Vector3.cs

I have seen that Unity’s Mathf often just calls .NET’s Math, which performs those calculations in doubles. But since .NET Standard 2.1 there is a float version, MathF. Unity should use the new math library now.

https://discussions.unity.com/t/668199 page-2#post-7492382

The link is about the fact that floats are treated internally as doubles.
UnityEngine.Mathf uses System.Math in some places, and System.Math only works with doubles. That would not change.
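To make the System.Math versus System.MathF distinction concrete, here is a small standalone sketch in plain .NET (no Unity needed). The class and method names are made up for illustration; the first method only mimics the double-based pattern described above and is not copied from Unity's source.

```csharp
using System;

// Hypothetical demo class; the names are illustrative assumptions.
public static class FloatMathDemo
{
    // Double path: the float is widened to double, System.Math computes in
    // double precision, and the result is narrowed back to float.
    public static float SqrtViaDouble(float x)
    {
        return (float)Math.Sqrt(x);
    }

    // Float path: System.MathF (available since .NET Standard 2.1) takes and
    // returns float, so no explicit widening or narrowing is needed.
    public static float SqrtViaFloat(float x)
    {
        return MathF.Sqrt(x);
    }

    public static void Main()
    {
        float x = 2f;
        Console.WriteLine(SqrtViaDouble(x));
        Console.WriteLine(SqrtViaFloat(x));
    }
}
```

Whether the float path is measurably faster depends on the runtime and the JIT or AOT compiler, so it is worth timing both variants in a build before relying on either.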