Problem Solving: 2D Billboard Sprites clipping into 3D environment

Hey guys, I’ve been experimenting with building a tactics engine using 3D geometry but 2D character sprites, similar to the Final Fantasy Tactics and Disgaea series. I’m using a simple shader for billboard sprites that always face the camera, and this works quite well in most cases.

Using one of their sprites as a placeholder, you can see they render fine at a shallow angle:

And are properly occluded by objects in front:

However, at any steeper camera angle, if they’re close to a vertical wall, the sprite tilts back to maintain its facing, and is clipped through by the wall.

This is realistic behaviour for a flat plane, but the sprites have their perspective baked in, and I need to avoid this clipping somehow. All of the sprite should render over objects further back than its bottom pixel, but still be occluded by obstacles in front.

Is there some way I can achieve this or avoid the clipping issue another way, short of designing my 3D space around it?

Thanks in advance!

EDIT: Including the shader I used for reference.

Shader "Unlit/Sprite_Billboard_Unlit"
{
    Properties
    {
        _MainTex("Texture", 2D) = "white" {}
    }

    SubShader
    {
        Tags{ "Queue" = "Transparent" "IgnoreProjector" = "True" "RenderType" = "Transparent" "DisableBatching" = "True" }

        ZWrite Off
        Blend SrcAlpha OneMinusSrcAlpha

        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            // make fog work
            #pragma multi_compile_fog

            #include "UnityCG.cginc"

            struct appdata
            {
                float4 vertex : POSITION;
                float2 uv : TEXCOORD0;
            };

            struct v2f
            {
                float2 uv : TEXCOORD0;
                UNITY_FOG_COORDS(1)
                float4 pos : SV_POSITION;
            };

            sampler2D _MainTex;
            float4 _MainTex_ST;

            v2f vert(appdata v)
            {
                v2f o;
                o.pos = UnityObjectToClipPos(v.vertex);
                o.uv = v.uv.xy;

                // billboard mesh towards camera
                float3 vpos = mul((float3x3)unity_ObjectToWorld, v.vertex.xyz);
                float4 worldCoord = float4(unity_ObjectToWorld._m03, unity_ObjectToWorld._m13, unity_ObjectToWorld._m23, 1);
                float4 viewPos = mul(UNITY_MATRIX_V, worldCoord) + float4(vpos, 0);
                float4 outPos = mul(UNITY_MATRIX_P, viewPos);

                o.pos = outPos;

                UNITY_TRANSFER_FOG(o,o.pos);
                return o;
            }

            fixed4 frag(v2f i) : SV_Target
            {
                // sample the texture
                fixed4 col = tex2D(_MainTex, i.uv);
                // apply fog
                UNITY_APPLY_FOG(i.fogCoord, col);
                return col;
            }
            ENDCG
        }
    }
}

I would like to ask about possible solutions to this, too.

I’m fairly new to shaders so forgive me if I’m asking something stupid, but is there a possibility to assign a single z-value to all of the pixels of an object?

Or maybe there’s another type of solution to this? Maybe something to do with these stencil settings?

I’d try setting the sorting order (perhaps based on the z coordinate), similar to what you would do when using sprites.

A screen facing sprite already has the same z across the entire surface, that’s in fact kind of the problem.

Stencils are useful for masking whether something should or shouldn’t show in an area of the screen. But this is a 2D screen space thing, you’d have to sort everything manually.

Sorting order won’t matter, because the depth buffer is the problem here. The sprite is intersecting with geometry already in the depth buffer, and the entire purpose of the depth buffer is to produce identical results regardless of the sorting order.

The “simple” solution is don’t put the sprite’s pivot on the ground. Put it at the center of the space they should be standing on and don’t let them get so close to the blocks to intersect.

Alternatively use vertically aligned sprites rather than camera facing sprites, that way they won’t ever intersect with the 3D objects due to vertical camera orientation. I believe this is the solution Octopath Traveler uses.

The more advanced shader only option would be to use a shader that pushes the sprite towards the camera by some world space amount. This would avoid them clipping into geometry they’re in front of, but may cause them to clip into geometry they’re behind.
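
A rough sketch of that push-towards-the-camera option, written as a helper for the vertex shader in the first post. `PushTowardsCamera` and `offset` are made-up names, and it assumes Unity’s usual view space where the camera looks down -Z, so treat it as a starting point rather than a tested implementation. It keeps the billboard’s screen position and only replaces its depth, so the sprite should not change size under a perspective camera:

```hlsl
// Hypothetical helper, not taken from the shaders in this thread.
// viewPos: billboarded view space position (w = 1), as computed in the original vert().
// offset:  assumed tuning value in world units; keep it well below the distance to the camera.
float4 PushTowardsCamera(float4 viewPos, float offset)
{
    float4 clipPos = mul(UNITY_MATRIX_P, viewPos);

    // Unity's view space looks down -Z, so adding to z moves the point towards the camera.
    float4 pushedViewPos = viewPos;
    pushedViewPos.z += offset;
    float4 pushedClipPos = mul(UNITY_MATRIX_P, pushedViewPos);

    // Keep x, y and w, and borrow only the depth of the pushed point.
    clipPos.z = pushedClipPos.z / pushedClipPos.w * clipPos.w;
    return clipPos;
}
```

In the original shader this would stand in for the `outPos` calculation, for example `o.pos = PushTowardsCamera(viewPos, 0.5);`, where 0.5 is an arbitrary value.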

The even more advanced shader only option would be to output the z as if the sprite is a vertically aligned sprite, but keep the appearance of a screen space sprite.

However, what I think both Disgaea and FFT do is sort all objects & ground tiles as if they were still 2D sprites. The character sprites never intersect with 3D geometry, they’re always either on top or behind geometry, which means they’re not using the depth buffer at all for the sprites (ZTest always). But each 3D object properly sorts with itself and those around it, and vfx clip all over the place, so depth is obviously still being used.

Most games I have seen doing this 2d/3d mix will be designed from the start with a tightly constrained camera tilt baked into the geometry. That means modeling that cube “leaning back” so it still feels like a cube but doesn’t lean forward into the paper-thin elements.

Like this (though this game doesn’t use sprites).

It should be noted that both FFT and Disgaea use orthographic cameras to help make stuff easier. Paper Mario uses fully 3D worlds, and camera facing sprites, but they also limit the camera movement significantly, and let sprites clip through objects; a bit of jank between the 2D and 3D elements is part of the game’s shtick.

Some other games with supposed mixed 2D and 3D elements cheat, and the 3D elements are flattened or otherwise rendered as sprites.

A tradition as old as Donkey Kong Country.

Some modern games ship with real animated 3D geometry and render it to sprites in real time, for example some recent “2D” fighting games on consoles, or Brawl Stars on mobile.

Thanks for the replies everyone! I have seen that method of tilting the levels, but I would like to keep physics on if it’s at all possible. That seems like an interesting way to do it, though.

This is actually what I meant in my first reply when I asked about the z-value of the pixels, but I guess I explained it badly? I meant writing a custom shader and modifying the pos.z value in the fragment shader. Is that what you meant, too?

Would that be plausible? If so, how can I pass a point in space to the fragment shader?

You’d probably want to do it in the vertex shader. Much cheaper and easier to do it there. However regardless of if you do it in the vertex or fragment shader, this hack will cause problems if your camera angle looks down or up too far, as eventually the appropriate z position goes beyond infinity, and before that well outside of the near and far clipping planes. That can be solved by clamping, but it’s something to be wary of. Think about looking straight down on a normal camera facing billboard and a really, really big vertical billboard. Any place you can’t see the vertical billboard there’s no valid mathematical solution to where to put the camera facing billboard’s “z” where it’ll still be visible, so at some point you just have to clamp the values to something reasonable, or maybe blend to using a different technique when the angle gets too high (like simply offsetting towards the camera).

Now how to do it, you’d have to first calculate the position for each vertex of the screen facing sprite, like you’re already doing. Then you’d take the world space view direction and do the math for a ray plane intersection, using the sprite’s pivot position as a point on the plane, and the y flattened, negative camera forward vector as the plane’s normal. Then you’d calculate the clip space position for both of those values, and override the z of the screen facing with the calculated vertical billboard’s “position”.

float4 viewPos = // what you have above.
float4 outPos = mul(UNITY_MATRIX_P, viewPos);

float3 planeNormal = -normalize(float3(UNITY_MATRIX_V._m20, 0.0, UNITY_MATRIX_V._m22));
float3 planePoint = unity_ObjectToWorld._m03_m13_m23;
float3 rayStart = _WorldSpaceCameraPos.xyz;
float3 rayDir = normalize(mul(UNITY_MATRIX_I_V, viewPos).xyz - rayStart); // convert view to world, minus camera pos
float dist = rayPlaneIntersection(planeNormal, planePoint, rayDir, rayStart);

float4 planeOutPos = mul(UNITY_MATRIX_VP, float4(rayStart + rayDir * dist, 1.0));

outPos.z = planeOutPos.z / planeOutPos.w * outPos.w;
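
For the steep-angle caveat mentioned above, one possible way to blend from the plane-based depth to a simpler fallback is sketched below. `BlendClipZ`, `fadeStart` and `fadeEnd` are hypothetical names and tuning values, and both z inputs are assumed to be clip space values that share the same `outPos.w`:

```hlsl
// Hypothetical helper, not part of the code above.
// planeZ:    clip space z taken from the vertical plane (outPos.z after the override above)
// fallbackZ: clip space z from a simpler technique, e.g. the billboard's own z
//            or a plain offset towards the camera
// fadeStart, fadeEnd: assumed tuning values in 0..1 (sine of the camera pitch)
//            where the blend starts and where it has fully switched to the fallback
float BlendClipZ(float planeZ, float fallbackZ, float fadeStart, float fadeEnd)
{
    // Row 2 of the view matrix is the camera's z axis in world space. Its y component
    // is 0 for a level camera and +/-1 when looking straight down or up.
    float steepness = abs(UNITY_MATRIX_V._m21);
    float t = smoothstep(fadeStart, fadeEnd, steepness);
    return lerp(planeZ, fallbackZ, t);
}
```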

Random thing, the rayPlaneIntersection function isn’t one that ships with Unity. You’d have to implement it yourself.
Or steal the one I used in this shader, though the order of the ray and plane inputs are backwards from the above code:
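
Since the linked shader isn’t reproduced here, a minimal stand-in (not bgolus’s version) could look like the following. It takes its arguments in the order used by the call in the snippet above; the sign-preserving clamp on the denominator is an assumption, added to avoid dividing by zero when the ray is nearly parallel to the plane.

```hlsl
// Sketch only. Returns the distance t along the ray such that
// rayStart + rayDir * t lies on the plane through planePoint with normal planeNormal.
float rayPlaneIntersection(float3 planeNormal, float3 planePoint, float3 rayDir, float3 rayStart)
{
    float denom = dot(planeNormal, rayDir);

    // Assumption: keep the sign but push the magnitude away from zero.
    denom = (denom < 0.0 ? -1.0 : 1.0) * max(abs(denom), 0.000001);

    return dot(planePoint - rayStart, planeNormal) / denom;
}
```

A negative result means the plane is behind the ray’s start point.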

Thank you for your reply bgolus. I have to take some time and go through your code carefully to understand it. I’m fairly bad at matrices and vector math, so I’ll try to implement your example and post my findings.

Actually Unity does ship with one, it’s this one https://docs.unity3d.com/ScriptReference/Plane.Raycast.html - you just need to supply a Plane for it, which is easy enough.

That’s a C# function. This is about doing ray tracing in a shader, which uses HLSL and can’t call C#.

Just curious, but have you had any success with that shader method? I was looking for this very thing when I first posted (sort z-depth as if the sprite is aligned to the up axis, but display the sprite as if it’s facing the angled camera straight on).

Unfortunately, I’m also not particularly versed in writing shaders - but I can take another crack at this to try it out if nobody else has done so.

In my case, there’s no worry that the camera angle will ever become too steep. I’ve locked it at 45 degrees or shallower, which matches the hand-drawn perspective for the sprites I’m using.

What I find interesting about this is that I’ve definitely seen games solve the issue in a non-disruptive way, but as far as I can tell it’s never been properly documented online. Take Recettear for example - which is an indie game itself:

The camera definitely has perspective, though it’s slight. Character sprites can almost hug the walls without clipping - and in the case of the 3D chest model you have to be literally standing inside it before inevitable clipping occurs (which it would on a sprite standing upright as well). That you can stand both in front and behind the chest and clip partially also indicates the z-buffer is still being used, doesn’t it?

You could rotate the sprite to a normal vertical position. This “compresses” the sprite, so it looks shorter and a little distorted.
Then you just scale the sprite on the y axis. This should only work with orthographic cameras. You need to calculate how much the sprite needs to be stretched. For example, if your camera’s x-axis rotation looks down at 60 degrees, your compressed vertical sprite needs to be stretched by exactly 2 on the y axis (1 / cos(60°)). This works pixel perfectly for me (if that’s a word). You might have to watch out for proper floor contact and 3D depth, but it worked pretty well for me.

Oh and of course, it only works with a fixed camera angle, at least on the x axis. If you want to rotate the camera around the y axis, you need proper interpolation so the sprites rotate with it, like in Ragnarok Online.
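
The stretch factor for a given camera pitch can be worked out with a one-liner. `VerticalSpriteYScale` and `pitchDegrees` are illustrative names, and the formula assumes an orthographic camera, where an upright quad appears shortened by the cosine of the camera’s downward pitch:

```hlsl
// Hypothetical helper: y scale that makes an upright sprite appear as tall as a
// camera-facing one under an orthographic camera pitched down by pitchDegrees.
float VerticalSpriteYScale(float pitchDegrees)
{
    return 1.0 / cos(radians(pitchDegrees));
}
```

For a pitch of 60 degrees this gives 2, matching the example above. The same arithmetic applies if you set the scale on the transform instead of in a shader.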

Here’s a working version of the shader with my coding errors fixed.
BillboardVerticalZDepth.shader

Shader "Unlit/BillboardVerticalZDepth"
{
    Properties
    {
        _MainTex("Texture", 2D) = "white" {}
    }

    SubShader
    {
        Tags{ "Queue" = "Transparent" "IgnoreProjector" = "True" "RenderType" = "Transparent" "DisableBatching" = "True" }

        ZWrite Off
        Blend SrcAlpha OneMinusSrcAlpha

        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            // make fog work
            #pragma multi_compile_fog

            #include "UnityCG.cginc"

            struct appdata
            {
                float4 vertex : POSITION;
                float2 uv : TEXCOORD0;
            };

            struct v2f
            {
                float4 pos : SV_POSITION;
                float2 uv : TEXCOORD0;
                UNITY_FOG_COORDS(1)
            };

            sampler2D _MainTex;
            float4 _MainTex_ST;

            float rayPlaneIntersection( float3 rayDir, float3 rayPos, float3 planeNormal, float3 planePos)
            {
                float denom = dot(planeNormal, rayDir);
                denom = max(denom, 0.000001); // avoid divide by zero
                float3 diff = planePos - rayPos;
                return dot(diff, planeNormal) / denom;
            }

            v2f vert(appdata v)
            {
                v2f o;
                o.pos = UnityObjectToClipPos(v.vertex);
                o.uv = v.uv.xy;

                // billboard mesh towards camera
                float3 vpos = mul((float3x3)unity_ObjectToWorld, v.vertex.xyz);
                float4 worldCoord = float4(unity_ObjectToWorld._m03, unity_ObjectToWorld._m13, unity_ObjectToWorld._m23, 1);
                float4 viewPos = mul(UNITY_MATRIX_V, worldCoord) + float4(vpos, 0);

                o.pos = mul(UNITY_MATRIX_P, viewPos);

                // calculate distance to vertical billboard plane seen at this vertex's screen position
                float3 planeNormal = normalize(float3(UNITY_MATRIX_V._m20, 0.0, UNITY_MATRIX_V._m22));
                float3 planePoint = unity_ObjectToWorld._m03_m13_m23;
                float3 rayStart = _WorldSpaceCameraPos.xyz;
                float3 rayDir = -normalize(mul(UNITY_MATRIX_I_V, float4(viewPos.xyz, 1.0)).xyz - rayStart); // convert view to world, minus camera pos
                float dist = rayPlaneIntersection(rayDir, rayStart, planeNormal, planePoint);

                // calculate the clip space z for vertical plane
                float4 planeOutPos = mul(UNITY_MATRIX_VP, float4(rayStart + rayDir * dist, 1.0));
                float newPosZ = planeOutPos.z / planeOutPos.w * o.pos.w;

                // use the closest clip space z
                #if defined(UNITY_REVERSED_Z)
                o.pos.z = max(o.pos.z, newPosZ);
                #else
                o.pos.z = min(o.pos.z, newPosZ);
                #endif

                UNITY_TRANSFER_FOG(o,o.pos);
                return o;
            }

            fixed4 frag(v2f i) : SV_Target
            {
                fixed4 col = tex2D(_MainTex, i.uv);
                UNITY_APPLY_FOG(i.fogCoord, col);

                return col;
            }
            ENDCG
        }
    }
}

Viewed from above

And an example of just how close to that block the quad actually is.

In case no one understood my suggestion (works only on orthographic cameras!):

  • Rotate your camera to 30,0,0
  • Put your sprite into a vertical position, so it can stand right beside a 3D cube without clipping into it
  • Scale your sprite to exactly 1,1.154701,1 // 1 / cos(30°)

Your sprite now looks exactly as it would if it were rotated by 30 degrees, but it won’t clip into the cube.
It might not be an option for everyone, but it’s a pretty easy solution.

Yeah, there’s a lot of solutions that involve just moving the sprite towards the camera a little.

Yep, that’s part of it. There’s a graphics programmer idiom of “always have an even number of sign errors.” :wink:

That prevents the bottom edge of the sprite from going through the floor. Just as the top is effectively pulled towards the camera by the plane projection, the bottom edge is pushed away. That works fine when there’s no floor, but with one, the example image above had its tongue under the floor.