Are all operations performed in a shader done in the GPU?
Does it depend on whether the device has a compute shader?
Sorry if this question seems too elementary, I am getting confused about the terminology. I read that shader code runs directly on GPU, but then I heard about graphics shaders and compute shaders which do not perform the same kind of computation.
To put it simply: if I use Shader Graph and if for instance I add two floats, or if I mix two colors using the Lerp node, will those operations be done in the GPU?
In brief, yes. Any operation that takes place within a shader happens on the GPU. The difference between graphics shaders and compute shaders is just that graphics shaders are for defining the instructions for manipulating vertices and rendering fragments of a mesh and compute shaders are for more general computations performed on the GPU (e.g. for games with 1000s of NPCs you might want to offload some calculations onto the GPU to save on CPU performance).
There is data that the CPU needs to send to shaders. For graphics shaders, the CPU sends render state, such as shader parameters, whenever those change. For compute shaders, the CPU needs to send any input the compute shader requires and, to make the compute shader actually useful, you usually need to read the results back on the CPU. Think of buoyancy simulations: that’s a graphics shader manipulating the vertices, but the same principle applies, because the CPU needs to read water heights to determine how deep an object is “underwater”.
tl;dr all instructions are performed on the GPU (edit: see Ben Cloward’s answer for exceptions to this), but the CPU will often need to send data to the shader and may sometimes need to read back data from the shader, depending on the situation.
For the most part, yes, but there are some exceptions. The shader compiler will try to optimize your shader as much as possible, so it will try to “precompute” some parts of it if it can. For example, if you add two constant values together, like 3 + 2, the compiler knows that the answer isn’t going to change, so it will do that math once on the CPU, store the result, and use it instead of re-computing it for every pixel every frame.
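To make the "precompute" idea concrete, here is a minimal sketch of constant folding written in Python. It is only an illustration of the general technique, not how any particular shader compiler is implemented; the `ConstantFolder` class and `fold` helper are names made up for this example, and it only handles `+`, `-` and `*`.

```python
import ast
import operator

# Operators this toy folder understands (an arbitrary choice for the sketch).
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    """Replace 'constant op constant' with the already-computed result."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first
        op = _OPS.get(type(node.op))
        both_constant = isinstance(node.left, ast.Constant) and isinstance(
            node.right, ast.Constant
        )
        if op is not None and both_constant:
            value = op(node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value=value), node)
        return node  # anything involving a variable is left for run time


def fold(expression: str) -> str:
    """Return the expression with constant sub-expressions precomputed."""
    tree = ast.parse(expression, mode="eval")
    folded = ConstantFolder().visit(tree)
    return ast.unparse(folded.body)


print(fold("3 + 2"))
print(fold("uv * (3 + 2)"))
```

In the second call, `uv` stands in for a value that changes per pixel: the constant part is folded ahead of time, and only the multiplication is left to be evaluated later.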
Many thanks for your answers, that helps a lot!
So when using Shader Graph to write a shader, I understand that, for the sake of performance, one should set exposed variables (those in the blackboard) from script as rarely as possible, because this is the CPU sending data to the GPU. Is this correct?
Worrying about optimizing shaders may be the wrong approach here. What are you trying to do? Unless you are pushing the limits of the GPU and need every ms of processing power, worrying about how many variables the shader has is a waste of time.
You can check how the Unity loop works at this page
For how the GPU works, I suggest finding information on DirectX and Vulkan
This is just a quick link I’ve found, but don’t expect to understand these topics quickly
Thanks for the links!
On update, I am blending (using Lerp) between 20 different 4k textures, resulting in 10 textures. I thought of two ways to do it:
- By exposing the 10 textures. On each update:
  - In script: compute the lerp amount (in [0, 1])
  - In script: compute the 10 resulting textures which are linear interpolations of the 20 textures
  - In script: set the 10 exposed textures that my shader uses.
- By just exposing the Lerp amount. On each update:
  - In script: compute the lerp amount (in [0, 1])
  - In script: set the exposed lerp amount to the shader
  - In shader: compute the 10 resulting textures in Lerp nodes that have the 20 textures as inputs.
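As a rough way to compare the two options, the sketch below estimates how much data the script would have to hand to the GPU on each update. The numbers are assumptions, not something stated in the thread: "4k" is taken as 4096 x 4096 texels, uncompressed RGBA at 1 byte per channel, no mipmaps, and the first option is assumed to upload all 10 blended textures every update.

```python
# Assumptions (not from the thread): 4096 x 4096 texels, uncompressed RGBA8.
TEXTURE_SIDE = 4096
BYTES_PER_TEXEL = 4
BLENDED_TEXTURES = 10
MIB = 1024 * 1024


def texture_bytes(side=TEXTURE_SIDE, bytes_per_texel=BYTES_PER_TEXEL):
    """Size of one uncompressed texture in bytes."""
    return side * side * bytes_per_texel


def upload_bytes_option_1():
    """Option 1: the script blends on the CPU, then uploads 10 textures."""
    return BLENDED_TEXTURES * texture_bytes()


def upload_bytes_option_2():
    """Option 2: the script only sets one 32-bit float (the lerp amount)."""
    return 4


print(f"Option 1: {upload_bytes_option_1() / MIB:.0f} MiB per update")
print(f"Option 2: {upload_bytes_option_2()} bytes per update")
```

Under these assumptions the first option moves hundreds of MiB per update, on top of doing the blend on the CPU, while the second moves a single float and leaves the blending to the shader.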
In general, operations can be ordered from most expensive to least expensive like this:
1. Most expensive - pixel shader operations - need to be computed for every pixel on every frame.
2. Vertex shader operations - computed for every vertex for every frame
3. CPU operations - computed once per frame
4. CPU operations - computed once on load or once at install time.
5. Cheapest - Offline operations - computed once by the developer
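To get a feel for the gap between these levels, here is a small sketch that counts how many times one operation would run per second at each per-frame level. The workload is hypothetical and chosen only for illustration: a 1920 x 1080 render where the mesh covers the whole screen with no overdraw, a 10,000-vertex mesh, and 60 frames per second.

```python
# Hypothetical workload, chosen only for illustration.
WIDTH, HEIGHT = 1920, 1080  # assumed: mesh fills the screen, no overdraw
VERTEX_COUNT = 10_000       # assumed mesh size
FPS = 60                    # assumed frame rate


def runs_per_second(stage):
    """How often a single operation runs per second at the given stage."""
    runs_per_frame = {
        "pixel": WIDTH * HEIGHT,  # once per covered pixel
        "vertex": VERTEX_COUNT,   # once per vertex
        "cpu_per_frame": 1,       # once per frame
    }
    return runs_per_frame[stage] * FPS


for stage in ("pixel", "vertex", "cpu_per_frame"):
    print(f"{stage}: {runs_per_second(stage):,} runs per second")
```

Operations done once on load or offline do not appear here because they do not repeat per frame at all. The counts say nothing about how fast each processor executes one operation; GPUs run these in parallel, so the list is about how often the work repeats.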
Obviously, you want to try to do as much as you can as #5 - offline. So in your case, if you can blend the 20 textures together in Photoshop and just save out 10 blended textures to use in the project, that would be cheapest. But if there’s something dynamic about the textures that’s changing, then it starts getting more expensive. Can you do the blending once during loading? Or do you need the textures to blend in the pixel shader every pixel, every frame? The answer depends on what about the data is changing and how often it changes.
When you say “On update…” does that mean when the frame updates, or does that mean when the user does something to cause a change? If it’s based on a change the user makes, you probably want to figure out a way to do this just once at that point and save out the results for use rather than doing it every pixel every frame.
Thanks for the answer, it’s really helpful!
I meant on frame update indeed. You can see my use case as a day-night cycle or a sky getting cloudy, then sunny, then stormy, etc. The textures change continuously, independently of the player’s behavior.
Example: if I want the sky to go from sunny to cloudy in 10 seconds, I will have both sunny and cloudy textures in my shader graph and, on frame update, I call Material.SetFloat to have the exposed variable LerpAmount going from 0 to 1 in 10 seconds.
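The per-frame script work in this example is tiny. Below is a sketch of it in Python rather than Unity C#, so the function names are hypothetical; `lerp` uses the usual linear interpolation formula, and `lerp_amount` is the value that would be passed to Material.SetFloat each frame.

```python
def lerp(a, b, t):
    """Linear interpolation: returns a at t = 0 and b at t = 1."""
    return a + (b - a) * t


def lerp_amount(elapsed_seconds, duration_seconds=10.0):
    """Fraction of the transition completed, clamped to [0, 1]."""
    t = elapsed_seconds / duration_seconds
    return max(0.0, min(1.0, t))


# Halfway through the 10-second transition, blending one channel of a
# "sunny" texel (1.0) with the matching "cloudy" texel (0.0).
t = lerp_amount(5.0)
print(t, lerp(1.0, 0.0, t))
```

The script only computes `t`; the `lerp` call is the part that the shader would run for every pixel.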
If I try to pre-compute as much as possible offline, I can create “blended” textures that correspond to several snapshots of the process (LerpAmount = 0, LerpAmount = 0.02, etc.). But to get a smooth effect, this would yield a huge number of blended textures, wouldn’t it? At 60 FPS, I would need up to around 600 textures for 10 s of animation. Depending on the size of the textures, I guess that is quite unrealistic in terms of memory. But this is just my way of thinking about computing this offline; others might have better ideas.
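A quick back-of-the-envelope check of that memory concern, as a sketch. The texture format is an assumption (4096 x 4096 texels, uncompressed RGBA at 1 byte per channel, no mipmaps); compressed formats would be smaller, but the order of magnitude is the point.

```python
FPS = 60
DURATION_SECONDS = 10

# Assumptions (not from the thread): 4096 x 4096 texels, uncompressed RGBA8.
TEXTURE_SIDE = 4096
BYTES_PER_TEXEL = 4
GIB = 1024 ** 3


def snapshot_count(fps=FPS, duration_seconds=DURATION_SECONDS):
    """One pre-blended snapshot per rendered frame of the transition."""
    return fps * duration_seconds


def snapshot_bytes():
    """Total storage for all snapshots of a single blended texture."""
    return snapshot_count() * TEXTURE_SIDE * TEXTURE_SIDE * BYTES_PER_TEXEL


print(f"{snapshot_count()} snapshots, {snapshot_bytes() / GIB:.1f} GiB")
```

That figure is for one transition of one texture, so pre-computing every frame offline does look unrealistic under these assumptions.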
For cases like this where you want things to change in real-time, the best way to go about it is what you’re already doing - using a shader to blend the results. It’s possible you could make things more efficient by using math to generate some of the images procedurally rather than sampling textures. But in general, the method you’re using seems appropriate for the application.
For your example, you shouldn’t pre-compute anything. A real-time lerp between two textures is perfectly fine on a GPU.
As a rule of thumb, the less memory you manipulate, the faster you run (so don’t pre-compute large textures just for a texture blend).
Also, don’t worry about calling Material.SetFloat once per frame; it won’t be a bottleneck.