The technique now commonly known as “MatCap”, as ZBrush called it, has been around since the ’90s in various forms. It was originally called the Lit Sphere model. The paper you linked to references it directly, and pairs it with something like Environment Mapped Bump Mapping, a technique that predates even the Lit Sphere model and was a defining feature of the Matrox G400, released in 1999. The main idea with environment mapping was that you took a square texture and mapped it onto a surface in a way that looked like a reflection. Today people generally just use a cubemap instead, as it looks better, is more accurate, and is probably in some ways cheaper to calculate.
Now, to get to your actual question. The main thing MatCap shaders do is take a normal, transform it into view space (or view direction space), and then use that normal’s xy to sample a texture. The technique above is doing the exact same thing, but the “view” is the light’s orientation. Unity doesn’t usually pass a full transform for the light, just a position, or in the case of directional lights, just a direction vector. But you can reconstruct one if you’re okay with assuming the light is always oriented “world up”, like most cameras are.
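For comparison, the usual view space version of that remap is only a couple of lines in the vertex function (a minimal sketch; UNITY_MATRIX_V is Unity’s built-in world to view matrix):

    // Classic matcap: transform the world normal into view space and
    // remap its xy from [-1,1] to the [0,1] UV range.
    float3 viewNormal = mul((float3x3)UNITY_MATRIX_V, worldNormal);
    float2 matcapUV = viewNormal.xy * 0.5 + 0.5;

The shader below does that same remap, but builds its own “view” matrix out of the light direction instead: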

Shader "Custom/LitSphereModel"
{
Properties
{
_Matcap ("Matcap", 2D) = "white"
}
SubShader
{
Tags { "RenderType" = "Opaque" }
Pass
{
Tags { "LightMode"="ForwardBase" }
CGPROGRAM
#pragma vertex vert
#pragma fragment frag
#include "UnityCG.cginc"
struct v2f
{
float4 pos : SV_POSITION;
float3 cap : TEXCOORD0;
};
sampler2D _Matcap;
v2f vert (appdata_full v)
{
v2f o;
o.pos = UnityObjectToClipPos(v.vertex);
float3 forward = _WorldSpaceLightPos0.xyz;
float3 up = float3(0,1,0);
float3 right = normalize(cross(up, forward));
up = -normalize(cross(right, forward));
float3x3 worldToLight = float3x3(right, up, forward);
float3 worldNormal = UnityObjectToWorldNormal(v.normal);
o.cap = mul(worldToLight, worldNormal);
return o;
}
fixed4 frag (v2f i) : SV_Target
{
if (i.cap.z < 0.0)
i.cap.xy = normalize(i.cap.xy);
else
i.cap = normalize(i.cap);
fixed4 col = tex2D(_Matcap, i.cap.xy * 0.5 + 0.5);
return col;
}
ENDCG
}
}
}
The problem, and something they don’t appear to address in the paper you linked, is what to do with stuff on the “back” of the model. All the examples they show appear to use an intentionally solid color around the rim, and I suspect that’s on purpose because it hides the streaking this produces on the back side with any texture that has detail around the outer edge.
That could probably be solved by using a stereographic projection instead of the straight remapping of the normal’s xy that I’m doing, but then it’s not really compatible with existing matcaps, though arguably neither is the above shader.
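If you wanted to try that, a stereographic remap in the fragment function might look something like this (just a sketch of the idea, not the paper’s exact formulation, and the 0.5 scale is an arbitrary guess):

    // Stereographic projection: project the normal from the back pole
    // onto a plane, so back facing normals spread out past the unit
    // disc instead of all collapsing onto its rim.
    // (n.z == -1 would divide by zero, so a real version would clamp.)
    float3 n = normalize(i.cap);
    float2 stereo = n.xy / (1.0 + n.z);
    // Arbitrary scale back into UV range; a texture authored for this
    // projection would pick its own mapping and would need to cover
    // more than the usual unit circle.
    fixed4 col = tex2D(_Matcap, stereo * 0.5 + 0.5);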
I’m also not doing any of the faked specular. That would mean taking the reflection vector and transforming it into light space as well, so it’s not a ton more work, but I’m too lazy at the moment.
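Roughly, the extra work in the vertex function would look something like this (a sketch; o.capSpec would be a second, hypothetical TEXCOORD added to v2f):

    // Faked specular: reflect the view direction off the world normal,
    // then transform the reflection vector into light space exactly the
    // same way as the normal.
    float3 worldPos = mul(unity_ObjectToWorld, v.vertex).xyz;
    float3 viewDir = normalize(worldPos - _WorldSpaceCameraPos);
    o.capSpec = mul(worldToLight, reflect(viewDir, worldNormal));

Then in the fragment function you’d do the same z check and remap on capSpec, and use it to sample a second matcap for the highlight on top of the first one.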