Hi!
I’m working on an educational project where I’m attempting to build a simple software renderer inside Unity.
I’ve been using some of the built-in matrix functions to construct an MVP matrix, and my problem is that I’m not sure which space I’m actually ending up in.
As far as I understand, the transformations in a renderer should be: local => world => camera/view => clip space => image space (NDC) => screen space.
The affine transformations from local to view space seem fine using Unity’s built-in matrices. However, when using GL.GetGPUProjectionMatrix or any of the other perspective matrices, it is unclear to me whether they actually perform the perspective divide and thus transform each vertex to image space or not. Are there any resources on exactly what Unity’s matrices do? I’m having trouble finding it in the documentation. One alternative would be to just define my own matrices, but I would like to avoid that if possible.
The homogeneous perspective division is not done by the matrix. It’s something you need to do after the transformation. You can do it at any stage after clipping.
There is not much that’s Unity-specific here. All you need to know is that:
- Unity uses the OpenGL convention, where NDC z ranges from -1 to 1 and the origin is at the bottom left.
- GL.GetGPUProjectionMatrix should only be used if you need to convert the OpenGL matrix to whatever convention the current API uses. If you are writing a software renderer, you shouldn’t be using it, otherwise you’d get different results with different rendering backends. It will also flip NDC y when you are rendering to a texture on a platform where the origin is at the top left.
- Unity uses reversed Z if SystemInfo.usesReversedZBuffer is set (which is the case on PC). I don’t know whether that comes in through GetGPUProjectionMatrix.
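To see these platform differences for yourself, a small debug component along these lines could log both matrices side by side. This is only a sketch: the class name ProjectionMatrixProbe is made up for illustration, and it assumes a camera tagged MainCamera exists in the scene.

```csharp
using UnityEngine;

// Hypothetical debug helper (not from the thread): logs the camera's
// OpenGL-convention projection matrix next to the API-specific one.
public class ProjectionMatrixProbe : MonoBehaviour
{
    void Start()
    {
        Camera cam = Camera.main; // assumes a camera tagged "MainCamera"
        if (cam == null)
        {
            Debug.LogWarning("No main camera found.");
            return;
        }

        // Platform-independent matrix (OpenGL convention).
        Matrix4x4 proj = cam.projectionMatrix;
        // The same matrix converted for the active graphics API,
        // here for rendering to the screen rather than into a texture.
        Matrix4x4 gpuProj = GL.GetGPUProjectionMatrix(proj, false);

        Debug.Log("Reversed Z: " + SystemInfo.usesReversedZBuffer);
        Debug.Log("cam.projectionMatrix:\n" + proj);
        Debug.Log("GL.GetGPUProjectionMatrix:\n" + gpuProj);
    }
}
```

Any difference between the two logged matrices is what a software renderer would inherit from the backend if it used the GPU variant.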
So the full transform chain is something along these lines (untested!):
Camera cam = Camera.main;
Matrix4x4 modelMatrix = transform.localToWorldMatrix;
Matrix4x4 viewMatrix = cam.worldToCameraMatrix;
Matrix4x4 projectionMatrix = cam.projectionMatrix;
Matrix4x4 modelViewProjection = projectionMatrix * viewMatrix * modelMatrix;
Vector4 objectSpacePosition = new Vector4(x, y, z, 1);
Vector4 clipSpacePosition = modelViewProjection * objectSpacePosition;
// Clip here against -w <= x, y, z <= +w in clip space (this corresponds to [-1, +1] in NDC, but is done without dividing by w!)
// Homogeneous division
Vector4 ndcSpacePosition = clipSpacePosition / clipSpacePosition.w;
// Viewport transform
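Vector2 ndcXY = new Vector2(ndcSpacePosition.x, ndcSpacePosition.y);
// Map [-1, +1] to [0, 1] (the negative y scale flips y), then scale to pixels
Vector2 screenSpacePosition = Vector2.Scale(Vector2.Scale(ndcXY, new Vector2(0.5f, -0.5f)) + new Vector2(0.5f, 0.5f), new Vector2(cam.pixelWidth, cam.pixelHeight));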
or alternatively
Matrix4x4 viewPortTransform = Matrix4x4.Scale(new Vector3(cam.pixelWidth, cam.pixelHeight, 1.0f)) * Matrix4x4.Translate(new Vector3(0.5f, 0.5f, 0.0f)) * Matrix4x4.Scale(new Vector3(0.5f, -0.5f, 1.0f));
Vector4 screenSpacePosition = viewPortTransform * new Vector4(ndcSpacePosition.x, ndcSpacePosition.y, 0, 1.0f);
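Multiplying out the three matrices makes it easier to check the viewport transform by hand. Writing W and H for cam.pixelWidth and cam.pixelHeight (symbols introduced here only for brevity), the product and its effect on an NDC point should come out as:

```latex
V = S(W, H, 1)\, T(0.5, 0.5, 0)\, S(0.5, -0.5, 1)
  = \begin{pmatrix}
      W/2 & 0    & 0 & W/2 \\
      0   & -H/2 & 0 & H/2 \\
      0   & 0    & 1 & 0   \\
      0   & 0    & 0 & 1
    \end{pmatrix},
\qquad
x_s = \frac{W}{2}\,(x_{ndc} + 1),
\qquad
y_s = \frac{H}{2}\,(1 - y_{ndc})
```

So NDC (-1, +1) maps to pixel (0, 0) and NDC (+1, -1) maps to (W, H). Because the translation column is multiplied by w, the same matrix gives the same result whether it is applied before or after the homogeneous division.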
If you know that no clipping will be necessary, you can do it all in one step:
Matrix4x4 modelViewProjectionViewport = viewPortTransform * projectionMatrix * viewMatrix * modelMatrix;
Vector4 screenPosition = modelViewProjectionViewport * objectSpacePosition;
screenPosition /= screenPosition.w;
Note that this is basically what Camera.WorldToScreenPoint does.
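One way to sanity-check the manual chain is to compare it against Camera.WorldToScreenPoint. The sketch below assumes a full-window camera tagged MainCamera, and the class name ScreenPointCheck is made up. One difference to expect: WorldToScreenPoint puts the origin at the bottom left, while the viewport transform above flips y, so the y values should mirror each other (y versus pixelHeight - y).

```csharp
using UnityEngine;

// Hypothetical check (not from the thread): compares the manual transform
// chain with Camera.WorldToScreenPoint for this object's origin.
public class ScreenPointCheck : MonoBehaviour
{
    void Start()
    {
        Camera cam = Camera.main; // assumes a full-window camera tagged "MainCamera"
        if (cam == null) return;

        Matrix4x4 mvp = cam.projectionMatrix * cam.worldToCameraMatrix * transform.localToWorldMatrix;
        Vector4 clip = mvp * new Vector4(0, 0, 0, 1); // object-space origin
        if (clip.w <= 0f) return; // behind the camera: the division is meaningless

        Vector4 ndc = clip / clip.w;
        // Same viewport mapping as above, with y flipped.
        float x = (ndc.x * 0.5f + 0.5f) * cam.pixelWidth;
        float y = (ndc.y * -0.5f + 0.5f) * cam.pixelHeight;

        Vector3 unity = cam.WorldToScreenPoint(transform.position);
        // Expect x to match unity.x and y to match pixelHeight - unity.y.
        Debug.Log($"manual: ({x}, {y})  unity: ({unity.x}, {cam.pixelHeight - unity.y})");
    }
}
```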
Now the more interesting part is how the clipping is done. See here for more information:
https://fabiensanglard.net/polygon_codec/clippingdocument/Clipping.pdf (chapter 8, Clipping in Projective Space)
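As a rough illustration of the idea, here is a sketch that clips a convex polygon against the near plane only, directly in clip space. It assumes the OpenGL convention (inside means z >= -w) and uses Sutherland-Hodgman style clipping, which may or may not match the approach in the linked document. The class and method names are made up, and the other five planes would follow the same pattern with a different distance function.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical helper (not from the thread or the linked document).
public static class ClipSpaceClipper
{
    // Signed "distance" to the near plane in clip space, OpenGL convention:
    // a vertex is inside when z >= -w, i.e. when w + z >= 0.
    static float NearDistance(Vector4 v) => v.w + v.z;

    // Sutherland-Hodgman clip of a convex polygon against the near plane.
    // Input and output vertices are clip-space positions (before dividing by w).
    public static List<Vector4> ClipAgainstNear(IReadOnlyList<Vector4> polygon)
    {
        var result = new List<Vector4>();
        for (int i = 0; i < polygon.Count; i++)
        {
            Vector4 current = polygon[i];
            Vector4 next = polygon[(i + 1) % polygon.Count];
            float dCurrent = NearDistance(current);
            float dNext = NearDistance(next);

            // Keep vertices that are on the inside of the plane.
            if (dCurrent >= 0f) result.Add(current);

            // The edge crosses the plane: add the intersection point.
            // Interpolating all four components keeps the point valid in clip space.
            if ((dCurrent >= 0f) != (dNext >= 0f))
            {
                float t = dCurrent / (dCurrent - dNext);
                result.Add(current + (next - current) * t);
            }
        }
        return result;
    }
}
```

With this scheme, a triangle with one vertex behind the near plane comes back as a quad, and one with two vertices behind it comes back as a smaller triangle. Only after all planes have been handled is it safe to divide by w.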
PS: All of this is actually documented here: Unity - Manual: Writing shaders for different graphics APIs
Many thanks for helping me out! Your solution is much cleaner than what I was trying to do myself, and most importantly it works now. Also, thanks for explaining the platform convention stuff, I think that was what caused my confusion before.
Now it’s time to clip some triangles!
PS: I might have been a bit lazy about reading the documentation haha