Suggestion - compile-time branching in Burst jobs

Hi! Let me know if this is in the wrong place.

Imagine you’re writing a job that first runs some set-up code, then a hot loop. Inside the loop you might want to do either A or B, say, call one method or another.

You have two alternatives: either you accept a branch in your hot loop, or you write two jobs where everything except that inner method call is identical. I have been doing the latter, but maintaining duplicate code is sub-optimal.

Instead, I’d like to be able to promise the Burst compiler that a certain variable won’t ever change while the job is running, i.e. that its value is fixed at schedule time. This would allow the compiler to get rid of branches based on that variable, and the hot loop could run branch-less.

Example:

[BurstCompile]
struct BlurJob : IJobParallelForBatch
{
   [ReadOnly] public float valueA;
   [ReadOnly] public float valueB;
   [ReadOnly] public float valueC;
   [ReadOnly] public NativeArray<float4> pixelsIn;
   [WriteOnly] public NativeArray<float4> pixelsOut;

   // public, because a public field below uses this type
   public enum FilterType { HQ, LQ }
   [SheduleTimeStatic] public FilterType _fType;

   public void Execute(int startIndex, int batchSize) {
      //this set up code is the same for all versions of the job
      float d = valueA + valueB;
      float e = d + valueC;

      //so is this loop
      for (int i = startIndex; i < startIndex + batchSize; i++) {
         float f = e + i;

         //this is different for each of the two versions of the job. The job is
         //re-compiled for each novel value of _fType, so the condition is never
         //checked at run time
         switch(_fType){
            case FilterType.HQ: pixelsOut[i] = HighQualityBlur(pixelsIn[i], f); break;
            case FilterType.LQ: pixelsOut[i] = LowQualityBlur(pixelsIn[i], f); break;
         }
      }
   }

   [MethodImpl(MethodImplOptions.AggressiveInlining)]
   static float4 HighQualityBlur(float4 pixel, float something) // "in" is a reserved keyword in C#
   {
   ...
   }

   [MethodImpl(MethodImplOptions.AggressiveInlining)]
   static float4 LowQualityBlur(float4 pixel, float something) // "in" is a reserved keyword in C#
   {
   ...
   }
}

The compiler would basically create two different versions of the above job, where only the body of the loop differs. It would be able to do this because the [SheduleTimeStatic] attribute promises it that the value won’t change inside the job, so it could treat it as a compile-time constant.

As you’re aware, there are a multitude of other cases where the compiler can do a much better job if it knows a value at compile time, such as more efficient vectorization if it knows the iteration count when looping over an array, and more efficient division/multiplication if it knows that it’s dealing with a power of two.

If the specific hardware it’s compiling for doesn’t support this for any reason, it could just fall back on ignoring the [SheduleTimeStatic] attribute. This would mean checking the condition at run time, which would be slower but produce an identical result.

Thank you for your time!


Since the value in the example is constant for the job, could the condition be moved outside the loop? It’ll still be one job, but it will move the jump instructions outside of the hot loop and allow the compiler to vectorize two separate loops.
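A minimal sketch of what moving the condition outside the loop could look like. To keep it runnable outside Unity it uses plain float arrays instead of NativeArray<float4>, and the two filter bodies are placeholders I made up, since the original post elides them:

```csharp
using System;

public enum FilterType { HQ, LQ }

public static class BlurSketch
{
    // Placeholder filters (assumptions, not the real blur code).
    public static float HighQualityBlur(float pixel, float f) => pixel * 0.5f + f;
    public static float LowQualityBlur(float pixel, float f) => pixel + f;

    public static void Execute(FilterType type, float[] pixelsIn, float[] pixelsOut,
                               int startIndex, int batchSize, float e)
    {
        int end = startIndex + batchSize;

        // The branch is evaluated once per batch instead of once per pixel.
        if (type == FilterType.HQ)
        {
            for (int i = startIndex; i < end; i++)
                pixelsOut[i] = HighQualityBlur(pixelsIn[i], e + i);
        }
        else
        {
            for (int i = startIndex; i < end; i++)
                pixelsOut[i] = LowQualityBlur(pixelsIn[i], e + i);
        }
    }
}
```

The downside is that the loop header is written twice, which gets awkward once the shared part of the loop body grows.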

If you absolutely need compile-time switches, I’ve found using generics works really well for situations like this.

// the struct constraint keeps default(BlurT) a value type, so no boxing or null
public struct BlurJob<BlurT> : IJob where BlurT : struct, IBlur {

    public void Execute() {
        default(BlurT).DoBlur(...);
    }

}

new BlurJob<HQBlur>().Schedule();
new BlurJob<LQBlur>().Schedule();
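For reference, a fuller sketch of the generic pattern with the loop included. The IBlur shape, the DoBlur signature and the two filter bodies are my own assumptions, and plain arrays stand in for NativeArray so it compiles outside Unity:

```csharp
using System;

// Hypothetical interface; the snippet above does not show its definition.
public interface IBlur
{
    float DoBlur(float pixel, float f);
}

// Empty structs whose only purpose is to select an implementation by type.
public struct HQBlur : IBlur
{
    public float DoBlur(float pixel, float f) => pixel * 0.5f + f; // placeholder body
}

public struct LQBlur : IBlur
{
    public float DoBlur(float pixel, float f) => pixel + f; // placeholder body
}

public struct BlurLoop<TBlur> where TBlur : struct, IBlur
{
    public float[] PixelsIn;
    public float[] PixelsOut;

    public void Execute(int startIndex, int batchSize, float e)
    {
        // TBlur is a struct, so this call is resolved statically for each
        // concrete type argument; there is no branch on a mode value at all.
        var blur = default(TBlur);
        for (int i = startIndex; i < startIndex + batchSize; i++)
            PixelsOut[i] = blur.DoBlur(PixelsIn[i], e + i);
    }
}
```

Each concrete type, e.g. BlurLoop<HQBlur> and BlurLoop<LQBlur>, is a separate instantiation, so the shared set-up and loop code is written only once while the inner call differs per version.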

@amarcolina's solution is the best, but if it doesn’t fit neatly into a set of interface methods, you could also do some hack like this (might be some typos, but you get the idea):

internal interface IMyJobMode { }
internal unsafe struct Mode1 : IMyJobMode { fixed byte unused[16];  }
internal unsafe struct Mode2 : IMyJobMode { fixed byte unused[32];  }

internal unsafe struct MyJob<TMode> : IJob where TMode : unmanaged, IMyJobMode {
    public void Execute() {
        // both sizes are known per instantiation, so this comparison can be folded
        if (sizeof(TMode) == sizeof(Mode1)) {
            /* do thing 1 */
        }
        else {
            /* do thing 2 */
        }
    }
}

new MyJob<Mode1>().Schedule();
new MyJob<Mode2>().Schedule();

Something like this works, but can quickly become unwieldy if there are many values:

public void Execute () {
    int runTimeValue = ... // 0,1,2,3
    switch (runTimeValue) {
        case 0:
            ActualExecute(0);
            break;
        case 1:
            ActualExecute(1);
            break;
        case 2:
            ActualExecute(2);
            break;
        case 3:
            ActualExecute(3);
            break;
    }
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
void ActualExecute (int COMPILE_TIME) {
    // once inlined, COMPILE_TIME is a known constant at each call site above
}

Hey this might actually do the trick! Thank you, I’ll try it!

So you’re basically using the size of the data type as an enum value? It’s a super cool idea. Would it actually resolve at compile time though?

This looks like a runtime switch-statement to me?

It is, but the compiler will inline the methods with the hardcoded argument - effectively lifting the conditional out of any of the loops or whatever is going on inside the complicated job and only evaluating it once at the start outside of the method itself (constant propagation). And all the code is still in one place, which is probably a good thing.
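To make that concrete, here is a small sketch with a loop inside the inlined method. It is plain C# so it runs anywhere; the two operations are made up for illustration, and whether the branch is actually folded away depends on the compiler honoring the inlining hint:

```csharp
using System;
using System.Runtime.CompilerServices;

public static class DispatchSketch
{
    public static void Execute(int mode, float[] data)
    {
        // The only real run-time branch: evaluated once, before any loop.
        switch (mode)
        {
            case 0: ActualExecute(0, data); break;
            case 1: ActualExecute(1, data); break;
            default: throw new ArgumentOutOfRangeException(nameof(mode));
        }
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    static void ActualExecute(int mode, float[] data)
    {
        for (int i = 0; i < data.Length; i++)
        {
            // Each call site passes a literal, so after inlining this
            // condition is constant and the optimizer can drop the dead side.
            if (mode == 0) data[i] *= 2f;
            else data[i] += 1f;
        }
    }
}
```

If the inlining does not happen, the code still gives the same results; it just pays for the inner check on every iteration.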

I’d like to point out that the other suggestions also seem to be runtime conditionals; the point is just to evaluate them once, outside the loop. You can’t really make a compile time conditional for a runtime switch.

amarcolina’s solution looks to scale a lot better if you’re going to add more optional functionality, so that may be the better solution in general. It has more boilerplate if it’s just for one bool, though, so I figured I’d post this option.