Also, let’s say I have a variable I only need in the fragment shader, and that only needs to be calculated once per mesh, based on a uniform. Is that the right place to calculate it? Or should I do it in the vertex shader? (I can’t seem to find it, but I remember Daniel Brauer posting something about how it might be faster to do that, despite the interpolation.) Or is there actually no way to achieve a once-per-mesh calculation, forcing me to do a calculation on the CPU, and send it over, to achieve greatest rendering performance?
If you need a value only once per mesh, you would normally feed it into a register (an exposed variable) from the scripting end.
But yes, if you do it in a shader, I would do it in the vertex shader, not just because of interpolation or the like, but also because a normal mesh will never have as many vertices as there are pixels on the texture.
Note that this text is written for OpenGL ES programmers. What it refers to is what dreamora said: you would transform the value of the uniform “in software” on the CPU, i.e. in the main OpenGL ES application, i.e. in C++/C/Objective-C/… (or by a script in Unity), and then you would set the uniform of the shader to the transformed value. Thus, the code for the transformation doesn’t appear in the shader.
You might think that the text is not very clear about it, but from the point of view of an OpenGL (ES) programmer it actually is pretty clear.
There is no way to perform calculations once per mesh on the GPU.
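To make the CPU-side approach concrete, here is a minimal C++ sketch, assuming the shader needs the sine and cosine of an angle that would otherwise arrive as a plain uniform. The names `RotationUniform` and `precomputeRotation` are hypothetical, made up for illustration. In a real OpenGL ES application the result would then be uploaded with a call such as `glUniform2f`; that step is only mentioned in a comment so the sketch stays self-contained.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for the already-transformed value.
// In a real OpenGL ES application this pair would be uploaded once per
// draw call, e.g. with glUniform2f(location, u.cosAngle, u.sinAngle).
struct RotationUniform {
    float cosAngle;
    float sinAngle;
};

// Runs on the CPU once per mesh (per draw call), so the shader never has
// to evaluate sin/cos per vertex or per fragment.
RotationUniform precomputeRotation(float angleRadians) {
    assert(std::isfinite(angleRadians));  // guard against garbage input
    return RotationUniform{std::cos(angleRadians), std::sin(angleRadians)};
}
```

The shader would then declare a two-component uniform and read the precomputed values directly, so the transformation code never appears in the shader itself.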
As far as I know, shader compilation on any platform will not move calculations between programs. They certainly won’t be moved by the Cg or GLSL compilers, and I doubt any compilation by a driver would move them, either. This means that you have to know the spaces and places across which your data will vary, and position calculations accordingly.
The three levels at which you can perform rendering calculations in Unity are:
Fragment shader - once per fragment
Vertex shader - once per vertex
Script - once per vertex, once per mesh, once per frame, once per second, once per scene, etc.
Note that although the order above is usually from most frequent to least frequent calculation, sometimes there will be more vertices in a mesh than resulting fragments rasterized. So a skinned mesh seen from a distance might actually have more CPU calculations done to display it than fragment shader calculations.
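A rough way to reason about the three levels is to count how often a calculation would run at each one. The C++ sketch below is only a back-of-the-envelope model: `DrawWorkload`, `Level`, `invocations` and `cheaperShaderStage` are hypothetical names, the script is assumed to compute the value once per mesh, and the comparison ignores interpolation cost and the differing cost per invocation on real hardware.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one mesh drawn in one frame.
struct DrawWorkload {
    std::uint64_t vertexCount;    // vertices run through the vertex shader
    std::uint64_t fragmentCount;  // fragments produced by rasterization
};

enum class Level { Script, Vertex, Fragment };

// How many times a calculation runs if it is placed at the given level.
// Assumption: the script computes it exactly once per mesh.
std::uint64_t invocations(const DrawWorkload& w, Level level) {
    switch (level) {
        case Level::Script:   return 1;
        case Level::Vertex:   return w.vertexCount;
        case Level::Fragment: return w.fragmentCount;
    }
    return 0;  // unreachable; keeps compilers quiet
}

// Picks the shader stage with fewer invocations (count only, nothing else).
Level cheaperShaderStage(const DrawWorkload& w) {
    return w.vertexCount <= w.fragmentCount ? Level::Vertex : Level::Fragment;
}
```

With made-up numbers, a 5,000-vertex mesh covering 200,000 fragments favors the vertex shader, while the same mesh seen from far away and covering only 300 fragments favors the fragment shader. The script level wins on count in both cases, which is the point made earlier in the thread.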
One thing to keep in mind: this describes the shader compiler that PowerVR provides to its GPU licensees. However, you never know whether a given device actually uses this shader compiler, a modified version of it, or something else entirely.
For example, on iOS (the largest market for PowerVR devices), large parts of the shader compiler front end are written by Apple itself, and only the shader back end seems to be written by PowerVR. I don’t know whether Apple’s compiler can extract calculations and compute them once per draw call. I guess the only way to know is to try it and measure.
And of course, all of the above can change with each OS/driver version.