So, surface shaders have been discontinued, and Unity has shown no interest in writing a system to abstract shaders between render pipelines unless they go through a shader graph. While I could rant for hours about how short-sighted this is, this post is about how to make the current situation for hand-writing shaders better.
Ideally what we want is to be able to write shader code which is:
- Portable across pipelines (URP/LWRP, HDRP, Standard)
- Upgradable across changes Unity makes (SRP/Unity versions, new lighting features)
- Abstracts away the complexity of managing all the required passes
- Is reasonably well optimized
- Hides the code we don't care about
I recently finished writing an adapter for both URP (aka LWRP) and HDRP, allowing my product MicroSplat to compile its shaders for all three pipelines. MicroSplat generates shader code, similar to a shader graph, and was modified to support an interface so that each pipeline could decide how to write that code. I wrote the URP adapter first, and since it was similar to the standard pipeline it was relatively easy to understand what was needed. What the adapter does is:
- Write out the bones of the shader (properties, etc.)
- For each pass:
  - Write out the pass header, includes, pragmas, and such
  - Write out the macros and functions needed to abstract the differences between URP and standard, such as the WorldNormalVector function, or defining _WorldSpaceLightPos0 as _MainLightPosition
  - Write out my code and functions in surface shader format
  - Write out the URP code for the vertex/pixel functions; this code packs its data into the structs that I use and then calls my code
With this, my surface shader code runs in URP. Since compiled shader code does not actually have structures, all this copying of data into new structs essentially compiles out, making it equivalently efficient in most cases.
I tried to take this same approach when porting to HDRP, and after many false starts came to the conclusion that understanding the HDRP code at this level would not only be extremely difficult, but would also create a compatibility nightmare whenever things changed. I wanted something that could be easily updated when new versions of the HDRP changed things, so I went with another approach instead.
Rather than writing out all of the passes and such, I'd export a shader from Unity's shader graph which contains various insertion points for my code. The same basic issues arise: I need to add functions and macros to reroute missing surface shader functions and conventions, like:
#define UNITY_DECLARE_TEX2D(name) TEXTURE2D(name);
#define UNITY_SAMPLE_TEX2D_SAMPLER(tex, samp, coord) SAMPLE_TEXTURE2D(tex, sampler_##samp, coord)
#define UnityObjectToWorldNormal(normal) mul(GetObjectToWorldMatrix(), normal)
Then copy their structs to mine:
Input DescToInput(SurfaceDescriptionInputs IN)
{
    Input s = (Input)0;
    s.TBN = float3x3(IN.WorldSpaceTangent, IN.WorldSpaceBiTangent, IN.WorldSpaceNormal);
    s.worldNormal = IN.WorldSpaceNormal;
    s.worldPos = IN.WorldSpacePosition;
    s.viewDir = IN.TangentSpaceViewDirection;
    s.uv_Control0 = IN.uv0.xy;
    return s;
}
And in each pass on the template, call my function with that data.
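The per-pass glue ends up looking roughly like this. This is a sketch: surf and the SurfaceOutputStandard fields follow surface shader conventions, and the SurfaceDescription field names here are illustrative rather than the template's literal code.

```hlsl
SurfaceDescription SurfaceDescriptionFunction(SurfaceDescriptionInputs IN)
{
    // convert the graph's inputs into my surface shader style Input struct
    Input i = DescToInput(IN);

    // run my existing surface shader code unchanged
    SurfaceOutputStandard o = (SurfaceOutputStandard)0;
    surf(i, o);

    // copy the results back into the struct the template's passes consume
    SurfaceDescription d = (SurfaceDescription)0;
    d.Albedo = o.Albedo;
    d.Normal = o.Normal;
    d.Metallic = o.Metallic;
    d.Smoothness = o.Smoothness;
    d.Occlusion = o.Occlusion;
    return d;
}
```

Because this is all struct copying, it compiles out the same way the URP adapter's copies do.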
While doing this I learned a lot about the internal way Unity is abstracting these problems in HDRP and allowing them to write less of the code in each shader graph shader. I think this approach could be improved to make hand-written HDRP shaders much easier to write. With a bit more work, you could have a .surfshader file type which uses a scriptable importer to inject the code inside of it into a templated shader like the one I use, and essentially have a large chunk of what surface shaders provide. Further, if LWRP were to follow the same standards, then porting from one pipeline to the other could also be automatic. To understand this, let's look at how an HDRP shader graph's code is written:
A series of defines are used to enable/disable things needed from the mesh:
#define ATTRIBUTES_NEED_TEXCOORD0
Code in the vertex shader can then use this define to filter which attributes appear in the appdata structure. The same trick is used for data needed in the pixel shader:
#define VARYINGS_NEED_TANGENT_TO_WORLD
Then any code which works with these things can check these defines to see if they exist.
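For example, the attribute struct can be filtered like this. This is a simplified sketch, not HDRP's literal struct, which has more fields and packing logic:

```hlsl
struct AttributesMesh
{
    float3 positionOS : POSITION; // always present
#ifdef ATTRIBUTES_NEED_NORMAL
    float3 normalOS : NORMAL;
#endif
#ifdef ATTRIBUTES_NEED_TEXCOORD0
    float4 uv0 : TEXCOORD0;
#endif
};
```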
This allows them to abstract many internals and determine whether various chunks of code need to run. However, if you ask me, the graph doesn't take this convention far enough. For instance, the graph writes out packing functions to move data between the vertex and pixel shaders; if the convention were followed fully, these could simply be #included instead of emitted into every shader. It also writes functions to compute commonly needed things in the pixel shader, such as the tangent-to-world matrix, but writes these functions out each time instead of relying on the defines to do the filtering. For instance:
SurfaceDescriptionInputs FragInputsToSurfaceDescriptionInputs(FragInputs input, float3 viewWS)
{
    SurfaceDescriptionInputs output;
    ZERO_INITIALIZE(SurfaceDescriptionInputs, output);
    output.WorldSpaceNormal = normalize(input.tangentToWorld[2].xyz);
    // output.ObjectSpaceNormal = mul(output.WorldSpaceNormal, (float3x3) UNITY_MATRIX_M); // transposed multiplication by inverse matrix to handle normal scale
    // output.ViewSpaceNormal = mul(output.WorldSpaceNormal, (float3x3) UNITY_MATRIX_I_V); // transposed multiplication by inverse matrix to handle normal scale
    output.TangentSpaceNormal = float3(0.0f, 0.0f, 1.0f);
    output.WorldSpaceTangent = input.tangentToWorld[0].xyz;
    // output.ObjectSpaceTangent = TransformWorldToObjectDir(output.WorldSpaceTangent);
    // output.ViewSpaceTangent = TransformWorldToViewDir(output.WorldSpaceTangent);
    // output.TangentSpaceTangent = float3(1.0f, 0.0f, 0.0f);
    output.WorldSpaceBiTangent = input.tangentToWorld[1].xyz;
    // output.ObjectSpaceBiTangent = TransformWorldToObjectDir(output.WorldSpaceBiTangent);
    // output.ViewSpaceBiTangent = TransformWorldToViewDir(output.WorldSpaceBiTangent);
    // output.TangentSpaceBiTangent = float3(0.0f, 1.0f, 0.0f);
    output.WorldSpaceViewDirection = normalize(viewWS);
    // output.ObjectSpaceViewDirection = TransformWorldToObjectDir(output.WorldSpaceViewDirection);
    // output.ViewSpaceViewDirection = TransformWorldToViewDir(output.WorldSpaceViewDirection);
    float3x3 tangentSpaceTransform = float3x3(output.WorldSpaceTangent, output.WorldSpaceBiTangent, output.WorldSpaceNormal);
    output.TangentSpaceViewDirection = mul(tangentSpaceTransform, output.WorldSpaceViewDirection);
    output.WorldSpacePosition = GetAbsolutePositionWS(input.positionRWS);
    // output.ObjectSpacePosition = TransformWorldToObject(input.positionRWS);
    // output.ViewSpacePosition = TransformWorldToView(input.positionRWS);
    // output.TangentSpacePosition = float3(0.0f, 0.0f, 0.0f);
    // output.ScreenPosition = ComputeScreenPos(TransformWorldToHClip(input.positionRWS), _ProjectionParams.x);
    output.uv0 = input.texCoord0;
    // output.uv1 = input.texCoord1;
    // output.uv2 = input.texCoord2;
    // output.uv3 = input.texCoord3;
    // output.VertexColor = input.color;
    // output.FaceSign = input.isFrontFace;
    // output.TimeParameters = _TimeParameters.xyz; // This is mainly for LW as HD overwrite this value
    return output;
}
If, instead of commenting and uncommenting these assignments, they were simply wrapped in define checks, this function would not need to exist in the top-level pass at all, and could instead be #included from some file:
#ifdef VARYINGS_NEED_WORLD_SPACE_POSITION
    output.WorldSpacePosition = GetAbsolutePositionWS(input.positionRWS);
#endif
The same would be true of things like the structure definitions:
struct SurfaceDescriptionInputs
{
#ifdef VARYINGS_NEED_WORLD_SPACE_POSITION
    float3 WorldSpacePosition; // optional
#endif
#ifdef VARYINGS_NEED_UV0
    float4 uv0; // optional
#endif
};
If this were done, very little code would have to exist in the actual output shader: only some defines saying what you use from the included code, plus the code you actually care about.
There's some squirminess about whether we even need to #ifdef around any of these: if a value is only computed in the pixel shader, anything we don't use gets stripped by the compiler anyway. So we don't really need a "VARYINGS_NEED_WORLD_SPACE_POSITION" define at all for pixel-only data, since the compiler will strip those values and calculations if we don't use them. In reality, we only need to define what crosses the vertex->pixel stage boundary (and hull, domain, etc. too), but Unity seems to output code that's super specific here, so I'm following that pattern.
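What genuinely needs the defines is the interpolator struct that crosses that stage boundary. A sketch of what that might look like (the real packing code also renumbers the TEXCOORD semantics as fields drop out, which this sketch doesn't do):

```hlsl
struct PackedVaryings
{
    float4 positionCS : SV_POSITION;
#ifdef VARYINGS_NEED_UV0
    float4 uv0 : TEXCOORD0;
#endif
#ifdef VARYINGS_NEED_WORLD_SPACE_POSITION
    float3 positionWS : TEXCOORD1;
#endif
};
```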
With that, a pass might look something like this:
#define ATTRIBUTES_NEED_POSITION // allow position in AttributesMesh struct
#define ATTRIBUTES_NEED_UV0 // allow uv0 in AttributesMesh struct
#define VARYINGS_NEED_UV0 // allow/copy to SurfaceDescriptionInputs struct
#define VARYINGS_NEED_WORLD_SPACE_POSITION
#define HAS_MESH_MODIFICATIONS // call my custom vertex function

AttributesMesh ApplyMeshModification(AttributesMesh input, float3 timeParameters)
{
    input.uv0 += timeParameters.x;
    return input;
}
TEXTURE2D(_MainTex);
SAMPLER(sampler_MainTex);

SurfaceDescription SurfaceDescriptionFunction(SurfaceDescriptionInputs IN)
{
    SurfaceDescription o = (SurfaceDescription)0;
    o.Albedo = SAMPLE_TEXTURE2D(_MainTex, sampler_MainTex, IN.uv0.xy).rgb;
    return o;
}
That now looks really manageable for a pass, right? There's nothing about the code we've written that couldn't run in URP as well as HDRP. There aren't pages of code around it, slightly modified for every shader; just what we care about. And none of that requires anything but some refactoring of the existing code that the shader graph writes out.
Where it gets really interesting:
So if we take this a bit further, we could write a ScriptedImporter which takes this code and inserts it into each pass of a templated shader file, very similar to what the graph does anyway, but without all the commenting and uncommenting of code and structure declarations. The one issue here is that some passes don't require computing all of the code. For instance, a shadow caster pass doesn't care about albedo/normals/etc., unless those components affect whether that pixel should be clipped or not.
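The template itself could look something like this, with a marker the importer replaces in every pass. The include names and marker syntax here are hypothetical, just to show the shape:

```hlsl
Pass
{
    Name "GBuffer"
    HLSLPROGRAM
    #include "HDRPPassPrologue.hlsl"  // hypothetical shared boilerplate

    // <INSERT_SURFSHADER_CODE>  the user's defines and functions get spliced in here

    #include "HDRPPassEpilogue.hlsl"  // hypothetical shared vertex/fragment entry points
    ENDHLSL
}
```

The importer reads the .surfshader file, replaces the marker in every pass of the template, and hands the result to Unity as a normal shader asset.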
Luckily, in many cases the shader compiler strips most of this code for us, so it doesn't matter much if it's in there. The internal functions could provide dummy data to these structures when they generally aren't needed, with defines available to override these behaviors when needed. Something like PASSSHADOWCASTER_NEED_TANGENT, for when you really do want a real tangent in your MeshAttributes and SurfaceFragmentInput structures instead of dummy data the compiler can use and strip.
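For instance, the shared copy function could do something like this (a sketch; SHADERPASS_SHADOWCASTER and PASSSHADOWCASTER_NEED_TANGENT are illustrative define names, not Unity's actual ones):

```hlsl
#if defined(SHADERPASS_SHADOWCASTER) && !defined(PASSSHADOWCASTER_NEED_TANGENT)
    // dummy constant the compiler can fold and strip
    output.WorldSpaceTangent = float3(1.0, 0.0, 0.0);
#else
    output.WorldSpaceTangent = input.tangentToWorld[0].xyz;
#endif
```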
So at this point, if both the LWRP and HDRP shaders followed these semantics, we'd have a shader that gives us most of the benefits we want. We can write something simple without thinking about passes and such, we can tell it what we need in terms of mesh and pixel data and be efficient about it, and we have something compatible with both pipelines, assuming we avoid features that don't exist in both. Additional defines could be used to select which template is used (SSS, decals, etc.) and enable/disable attributes of the structure and packing routines accordingly. You'd have to wrap your assignments of those in the same checks, but that seems reasonable. You lose the ability to name things in your structures, but that honestly seems like a win to me…