Is the pow function “always” slower than repeated multiplication?
Will there be a point where x gets so large that using pow would actually be more beneficial?

For the above examples, the loop is faster, but that’s because it’s equivalent to pow(n, 256) and not pow(n, 512).
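To see where the 256 comes from, here is a minimal C sketch. It assumes the loop in question squares its value once per iteration for 8 iterations; that is a guess at the original loop, and the names `square_repeatedly` and `exponent_after` are made up for illustration. Each squaring doubles the exponent, so 8 passes reach n^256 and it would take a 9th to reach n^512.

```c
#include <assert.h>

/* Hypothetical reconstruction of the loop (an assumption, not the original code):
   square the running value once per iteration. */
float square_repeatedly(float n, int iterations)
{
    assert(iterations >= 0);
    float result = n;
    for (int i = 0; i < iterations; ++i) {
        result *= result;   /* exponent doubles each pass: 2, 4, 8, ... */
    }
    return result;          /* equals n^(2^iterations) */
}

/* Exponent reached after a given number of squarings: 2^iterations. */
unsigned exponent_after(int iterations)
{
    assert(iterations >= 0 && iterations < 32);
    return 1u << iterations;
}
```

Under that assumption, `exponent_after(8)` is 256 and `exponent_after(9)` is 512, so an 8-iteration loop does one squaring less than pow(n, 512) would need.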

However, let’s reverse the question and ask “is the loop ever faster than the pow() function?” For powers hard-coded in the shader, the answer is no. At best the two are equivalent, potentially to the point of compiling to identical shaders; at worst the loop is slower. For powers above 512, pow() always wins.

If the power is set from a material property, or the power is high enough, the compiled shader falls back to a general-purpose method of raising a value to an arbitrary power, which takes 8~9 cycles (depending on hardware) regardless of the exponent. In this case the loop could conceivably be faster, but only up to 4 iterations (powers below 16), as dynamic loops have their own cost.