Creating multiple sampler2Ds to optimize shader performance?

I recently read about the performance cost of texture sampling in shaders here. If I understand correctly, textures are sampled by dedicated hardware in the GPU. If you are sampling different textures, this can run in parallel with no additional cost (“sampling 12 textures at once takes only as long as the slowest texture sample”).

Does this mean that sampling a texture array or atlas multiple times is faster when each sampling operation uses its own sampler? E.g. you have a terrain with 4 layers and triplanar mapping, so 4 × 3 = 12 sampling operations. If I’m correct, with a 2x2 texture atlas bound to a single sampler this results in ~12x sampling cost. Now, to optimize, I use four different sampler2Ds for the terrain layers instead, and three additional sampler2Ds for triplanar mapping. This could reduce the total to 1x sampling cost, as all 12 sample operations now run in parallel. Is this a common approach, and does it have negative side effects, e.g. on memory bandwidth?
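To make the setup concrete, here is a minimal sketch of the atlas variant I mean, written as a GLSL ES 3.00 fragment shader. All names (`uAtlas`, `uTiling`, `vLayerWeights`, `atlasUV`, `sampleLayer`) are made up for illustration and do not come from any engine. It only shows where the 12 `texture()` calls come from and says nothing about how fast they actually run. Mip bleeding at the atlas cell borders is ignored.

```glsl
#version 300 es
precision highp float;

// Hypothetical uniforms and varyings, assumed for this sketch only.
uniform sampler2D uAtlas;   // 2x2 atlas holding the 4 terrain layers
uniform float uTiling;      // world-space texture scale

in vec3 vWorldPos;
in vec3 vWorldNormal;
in vec4 vLayerWeights;      // splat weights for the 4 layers

out vec4 fragColor;

// Map a tiled UV into one cell of the 2x2 atlas (layer index 0..3).
vec2 atlasUV(vec2 uv, int layer) {
    vec2 cell = vec2(float(layer % 2), float(layer / 2));
    return (fract(uv) + cell) * 0.5;
}

// Triplanar lookup of one layer: 3 texture() calls on the same sampler.
vec4 sampleLayer(int layer, vec3 p, vec3 blend) {
    vec4 x = texture(uAtlas, atlasUV(p.zy, layer));
    vec4 y = texture(uAtlas, atlasUV(p.xz, layer));
    vec4 z = texture(uAtlas, atlasUV(p.xy, layer));
    return x * blend.x + y * blend.y + z * blend.z;
}

void main() {
    // Blend weights from the surface normal, normalized to sum to 1.
    vec3 blend = abs(normalize(vWorldNormal));
    blend /= (blend.x + blend.y + blend.z);
    vec3 p = vWorldPos * uTiling;

    // 4 layers x 3 projections = 12 samples, all through uAtlas.
    vec4 color = vec4(0.0);
    for (int i = 0; i < 4; ++i) {
        color += vLayerWeights[i] * sampleLayer(i, p, blend);
    }
    fragColor = color;
}
```

The variant I am asking about would replace `uAtlas` with separate `sampler2D` uniforms, so the same 12 lookups would go through different samplers instead of one.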

Thank you for your help! I would also appreciate any papers, books, etc. where I can find out more about performant shader programming, especially regarding mobile GPUs.

i’m not an expert by any means, but what i’ve inferred from posts i’ve seen on here is that some GPUs will handle having multiple samplers better than reusing one, and others will handle it worse, so it’s kind of an un-optimizable area where you should just work with whichever pattern you prefer.

there are some specifics as to which vendors are better at each, but all i recall is that it’s not as much a difference of gpu model as it is of company (i.e. nvidia vs amd).
