Loop alignment in Burst?

I was reading an interesting article about loop alignment in .NET 6. It seemed like a fascinating optimization topic, and I thought of Burst right away! With the addition of new Burst hints like Likely/Unlikely, might it make sense to add loop alignment hints as well? Or does Burst/LLVM take care of this already? I don’t think I’ve ever noticed NOP padding in the generated assembly.

I could imagine a (hypothetical) syntax like this:

void Execute() {
    //stuff

    Burst.CompilerServices.Hint.AlignLoop();
    for(int i=0; i<array.Length; i++) {
        //loop
    }
}

I read a lot of generated x86 when working with Burst, and it often contains small loops.

When reaching such a loop, I often see this:

    .p2align    4, 0x90

.LBB0_10:
    vpaddb    ymm6, ymm6, ymmword ptr [rax + rdx]
    vpaddb    ymm5, ymm5, ymmword ptr [rax + rdx + 32]
    vpaddb    ymm4, ymm4, ymmword ptr [rax + rdx + 64]
    vpaddb    ymm3, ymm3, ymmword ptr [rax + rdx + 96]
    sub    rdx, -128
    cmp    edx, 32640
    jne    .LBB0_10

Your post caught my interest. I was not aware of this optimization, although it makes a lot of sense in hindsight. I googled the .p2align directive, and the first result, https://stackoverflow.com/questions/21546946/what-does-p2align-do-in-asm-code, suggests that LLVM already performs this optimization. Hope it helps :wink:
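For anyone reading along, here is how I understand the directive above: `.p2align 4, 0x90` asks the assembler to pad up to the next 2^4 = 16-byte boundary, using 0x90 (the single-byte x86 NOP) as the fill value. Since the padding appears as a directive in the compiler's assembly output rather than as explicit instructions, that may be why it is easy to overlook. A minimal sketch of the arithmetic in C, where `p2align_padding` and `p2align_next` are hypothetical helper names of my own and not part of Burst or LLVM:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a Burst or LLVM API): number of padding bytes
   needed to move `address` up to the next 2^exponent boundary, which is
   the effect I understand ".p2align exponent" to have. */
static uint64_t p2align_padding(uint64_t address, unsigned exponent)
{
    assert(exponent < 64);                         /* keep the shift well-defined */
    uint64_t alignment = (uint64_t)1 << exponent;  /* .p2align 4 -> 16 bytes */
    uint64_t mask = alignment - 1;                 /* low bits that must be zero */
    return (alignment - (address & mask)) & mask;  /* 0 if already aligned */
}

/* Hypothetical helper: the aligned address where the loop label would land. */
static uint64_t p2align_next(uint64_t address, unsigned exponent)
{
    return address + p2align_padding(address, exponent);
}
```

For example, a loop that would otherwise start at address 0x1003 needs 13 bytes of padding to start at 0x1010, while a loop already at 0x1010 needs none.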


Ah very cool! I did not know about the .p2align directive, sounds like this is indeed already working just as I had hoped. Thanks for the info!

That really was a great article though, so thanks for sharing! We’ve taken a note of it in case there is anything more we can do (like the hint you suggested) to make LLVM optimize the code even better.
