Hey, I tried compiling your code with the fewest possible modifications, and on my end it looks like the loop is vectorized properly in most cases. I’m pretty bad at reading assembly though, so don’t ask me exactly what’s happening. I just look for them pretty pink asm ops.
[BurstCompile]
public static class VectorizationTest
{
    [BurstCompile]
    public static void TestLoopVec([NoAlias] ref UnsafeList<int> dataA, [NoAlias] ref UnsafeList<int> dataOut)
    {
        for (int i = 0; i < dataA.Length; i++)
        {
            Unity.Burst.CompilerServices.Loop.ExpectVectorized();
            int a = dataA[i];
            int res = a + 1;
            dataOut[i] = res;
        }
    }
}
Compiles to this:
Assembly Screenshot
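For reference, calling that static method from regular managed code might look roughly like the sketch below. It assumes a Burst version that supports direct calls to `[BurstCompile]` static methods; the `VectorizationTestCaller` class and the element count are made up for illustration.

```csharp
using Unity.Collections;
using Unity.Collections.LowLevel.Unsafe;
using UnityEngine;

// Hypothetical caller, not part of the original code.
public class VectorizationTestCaller : MonoBehaviour
{
    void Start()
    {
        const int count = 1024; // arbitrary size chosen for the example

        // The capacity passed to the constructor does not set Length, so resize explicitly.
        var input = new UnsafeList<int>(count, Allocator.Temp);
        var output = new UnsafeList<int>(count, Allocator.Temp);
        input.Resize(count, NativeArrayOptions.ClearMemory);
        output.Resize(count, NativeArrayOptions.ClearMemory);

        for (int i = 0; i < count; i++)
            input[i] = i;

        // Call the Burst-compiled static method directly.
        VectorizationTest.TestLoopVec(ref input, ref output);

        Debug.Log(output[count - 1]);

        input.Dispose();
        output.Dispose();
    }
}
```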
(Yes, you can compile a standalone function with Burst, although there are some constraints, hence the UnsafeList instead of a NativeArray.) A complete job struct looks more like this:
[BurstCompile]
struct VectorizationTestJob : IJob
{
    public NativeArray<int> Data1;
    public NativeArray<int> Data2;

    void IJob.Execute()
        => TestLoopVec(Data1, Data2);

    static void TestLoopVec([NoAlias] NativeArray<int> dataA, [NoAlias] NativeArray<int> dataOut)
    {
        for (int i = 0; i < dataA.Length; i++)
        {
            Unity.Burst.CompilerServices.Loop.ExpectVectorized();
            int a = dataA[i];
            int res = a + 1;
            dataOut[i] = res;
        }
    }
}
Compiles like this:
Assembly Screenshot
Note that in both of these cases the loop doesn’t know what the size of the array is. In your original example the compiler can theoretically deduce that the array has a specific size. You’re creating a NativeArray struct with a specific (and very short) constant length, and your TestLoopVec function gets inlined and simplified based on that hint. I’m guessing that this is influencing the compilation result.
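One way to test that guess, as a sketch only: mark the loop function with `MethodImplOptions.NoInlining` so the constant length can't be propagated into it, then compare the assembly. This assumes Burst honors `NoInlining` in your version; the job name `VectorizationTestNoInlineJob` is made up, and I haven't inspected the output of this exact variant.

```csharp
using System.Runtime.CompilerServices;
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;

// Hypothetical variant of the original job, for comparison in the Burst Inspector.
[BurstCompile]
struct VectorizationTestNoInlineJob : IJob
{
    public void Execute()
    {
        // Same short, constant-length setup as the original example.
        var inputArray = new NativeArray<int>(4, Allocator.Temp);
        for (int i = 0; i < inputArray.Length; i++)
            inputArray[i] = i;
        var outputArray = new NativeArray<int>(inputArray.Length, Allocator.Temp);
        TestLoopVec(inputArray, outputArray);
    }

    // Assumption: NoInlining keeps this as a separate function, so the loop
    // is compiled without knowing that the length is 4.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void TestLoopVec([NoAlias] NativeArray<int> dataA, [NoAlias] NativeArray<int> dataOut)
    {
        for (int i = 0; i < dataA.Length; i++)
        {
            Unity.Burst.CompilerServices.Loop.ExpectVectorized();
            dataOut[i] = dataA[i] + 1;
        }
    }
}
```

If the guess is right, the loop body here should look more like the unknown-length versions than the inlined four-element one.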
Back to your code, I put it back together like this:
[BurstCompile]
struct VectorizationTestOriginalJob : IJob
{
    public NativeArray<int> Data1;
    public NativeArray<int> Data2;

    void IJob.Execute()
    {
        var inputArray = new NativeArray<int>(4, Allocator.Temp);
        inputArray[0] = 0;
        inputArray[1] = 1;
        inputArray[2] = 2;
        inputArray[3] = 3;
        var outputArray = new NativeArray<int>(inputArray.Length, Allocator.Temp);
        TestLoopVec(inputArray, outputArray);
    }

    static void TestLoopVec([NoAlias] NativeArray<int> dataA, [NoAlias] NativeArray<int> dataOut)
    {
        for (int i = 0; i < dataA.Length; i++)
        {
            Unity.Burst.CompilerServices.Loop.ExpectVectorized();
            int a = dataA[i];
            int res = a + 1;
            dataOut[i] = res;
        }
    }
}
It compiles to this (skipping over some unimportant bits in the burst.initialize and burst.initialize.externals sections):
Assembly Screenshot
Technically there are fewer vectorized instructions. However, I’m not getting any errors from Loop.ExpectVectorized(), and the diagnostics tab doesn’t warn me about Burst being unable to vectorize the loop. Let’s see what happens when we change the setup code to this:
void IJob.Execute()
{
    var inputArray = new NativeArray<int>(8192, Allocator.Temp);
    for (int i = 0; i < 8192; ++i)
        inputArray[i] = i;
    var outputArray = new NativeArray<int>(inputArray.Length, Allocator.Temp);
    TestLoopVec(inputArray, outputArray);
}
Assembly Screenshot
Not gonna lie, looks like Latin to me. But it’s in pink and starts with a v, so I’m happy. This looks much closer to the vectorized assembly above. Basically, the compiler apparently decided it’s not worth it to emit all this code for looping over an array that contains just 4 items. Perhaps someone more well-versed in pink vlatin can offer additional explanation.
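In the meantime, a rough mental model of what a vectorized loop like this does, written as plain C# with no Unity dependency. This is a sketch and not what Burst actually emits: the lane count of 8 assumes 256-bit registers holding 32-bit ints (the real width depends on the target CPU), and `ChunkedAddSketch` / `AddOne` are made-up names.

```csharp
using System;

public static class ChunkedAddSketch
{
    // Assumption: 8 ints per vector register (256 bits / 32 bits).
    const int Lanes = 8;

    public static void AddOne(int[] input, int[] output)
    {
        int i = 0;

        // Largest multiple of Lanes that fits in the array.
        int vectorEnd = input.Length - (input.Length % Lanes);

        // Main loop: each outer iteration stands in for one wide load,
        // one packed add (something like vpaddd) and one wide store.
        for (; i < vectorEnd; i += Lanes)
        {
            for (int lane = 0; lane < Lanes; lane++)
                output[i + lane] = input[i + lane] + 1;
        }

        // Remainder loop: leftover elements are handled one at a time.
        for (; i < input.Length; i++)
            output[i] = input[i] + 1;
    }
}
```

With only 4 elements, `vectorEnd` is 0 in this sketch, so the main loop never runs and only the remainder loop does any work. That fits the idea that the wide code path isn't worth emitting for such a short, known length.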