How do I test my implementation of a 3D rendering system, and how does the w division work?

I’m trying to learn more about the rendering process, and I’m trying to create the matrices from scratch. This is how I made them:
        public static Matrix4 Translation(Vector3 position)
        {
            return new Matrix4(
                                 new Vector4(1, 0, 0, 0),
                                 new Vector4(0, 1, 0, 0),
                                 new Vector4(0, 0, 1, 0),
                                 new Vector4(position, 1),
                                 MatrixCtor.Line);
        }
        public static Matrix4 Scalation(Vector3 scale)
        {
            return new Matrix4(
                                 new Vector4(scale.x, 0, 0, 0),
                                 new Vector4(0, scale.y, 0, 0),
                                 new Vector4(0, 0, scale.z, 0),
                                 new Vector4(0, 0, 0, 1),
                                 MatrixCtor.Line);
        }

The constructor for the custom class Matrix4 uses 4 vectors and an enum that specifies if the vectors are meant to be lines or columns.

So Matrix4.Translation(new Vector3(2, 2, 3)) will return:
| 1 0 0 0 |
| 0 1 0 0 |
| 0 0 1 0 |
| 2 2 3 1 |
The same for scaling.
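
A quick sanity check I can do on this (assuming my Vector4 * Matrix4 operator works; the point is just made-up test data):

        // Row-vector convention: v' = v * M, so the translation sits in the last line.
        Vector4 p = new Vector4(1, 2, 3, 1);
        Vector4 t = p * Matrix4.Translation(new Vector3(2, 2, 3));
        // Expected: t == (3, 4, 6, 1), each component offset by the translation.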
I didn’t implement the rotation yet because I want a better understanding of quaternions first.
Now, for the view matrix, this is how I defined it:

        public static Matrix4 Camera(Vector3 eye, Vector3 at, Vector3 up)
        {
            Vector3 zaxis = (eye - at).normalized;
            Vector3 xaxis = Vector3.Cross(up, zaxis).normalized;
            Vector3 yaxis = Vector3.Cross(zaxis, xaxis);
            return new Matrix4(new Vector4(xaxis, Vector3.Dot(xaxis, eye)),
                                 new Vector4(yaxis, Vector3.Dot(yaxis, eye)),
                                 new Vector4(zaxis, Vector3.Dot(zaxis, eye)),
                                 new Vector4(0, 0, 0, 1), MatrixCtor.Column);
        }

Now, since I’m using row-vector multiplication, I don’t know whether MatrixCtor should be Column or Line, but I tried both.

For perspective projection I used:

        public static Matrix4 Perspective(float aspect, float near, float far, float fov)
        {
Matrix4 perspective = Matrix4.Null;
            perspective.M00 = 1f / aspect * (float)Mathf.Tan(fov / 2f);
            perspective.M11 = 1f / (float)Mathf.Tan(fov/2f);
            perspective.M22 = -(float)(far + near) / (float)(far - near);
            perspective.M32 = -1f;
            perspective.M23 = (-2f * far * near) / (float)(far - near);
            return perspective;
        }
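
One way I could test this in isolation (assuming this runs inside Unity, and that my layout may need transposing) is comparing against Unity’s built-in Matrix4x4.Perspective:

        Matrix4x4 reference = Matrix4x4.Perspective(40f, 1f, 0.03f, 1000f); // Unity takes fov in degrees
        Matrix4 mine = Matrix4.Perspective(1f, 0.03f, 1000f, 40f);
        // Compare entry by entry, transposing one side if the layouts differ.
        // Note: Mathf.Tan expects radians, while Unity's Perspective takes degrees.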

Now I did the perspective division:

        Vector3[] vertices = new Vector3[36]
        {
            // ... VAO coordinates for a cube; I haven’t implemented indexing yet
        }
        Matrix4 Pos = Matrix4.Translation(new Vector3(0, 0, 0));
        Matrix4 Sc = Matrix4.Scalation(new Vector3(1, 1, 1));
        Matrix4 M = Pos * Sc;
        Matrix4 V = Matrix4.Camera(new Vector3(0, 0, -10), new Vector3(0, 0, 1), new Vector3(0, 1, 0));
        Matrix4 P = Matrix4.Perspective(1, 0.03f, 1000f, 40f);
        Matrix4 MVP = M * V * P; //row-vector order
       
        for (int i = 0; i < 36; i++)
        {
            Vector4 position = new Vector4(vertices[i], 1) * MVP; //row-vector order
            position.x /= position.w;
            position.y /= position.w;
            position.z /= position.w;
            screen[i] = new Vector2(position.x, position.y).ToScreen();
        }
        Project(screen);

Now, from NDC coordinates, I tried to remap to screen coordinates:

        public static Vector2 ToScreen(this Vector2 p)
        {
            // Map NDC x from [-1, 1] to [0, Width] and flip y so that +1 is the top row.
            return new Vector2((p.x + 1) * bitmap.Width / 2, Math.Abs((p.y + 1) * bitmap.Height / 2 - bitmap.Height));
        }
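
Checking the remap by hand for a hypothetical 800x600 bitmap (plain numbers, independent of my other types):

        float W = 800, H = 600; // hypothetical bitmap size
        Vector2 topLeft = new Vector2((-1 + 1) * W / 2, Math.Abs((1 + 1) * H / 2 - H)); // expect (0, 0)
        Vector2 center  = new Vector2(( 0 + 1) * W / 2, Math.Abs((0 + 1) * H / 2 - H)); // expect (400, 300)
        // If either value differs from the expectation, the remapping step is the problem.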

After this I construct a bitmap by rasterizing the pixels, and then I display it in a Unity Texture2D.
Something is wrong in this process. My first question is how to test each step and see where the problem is. My second question is: why does every website say that the x, y, z clip-space coordinates get divided by w to obtain NDC, if I lose the z coordinate anyway when I convert to screen coordinates?

It’s not really clear what your goal is. However, when you roll your own types, there are several decisions you have to make and certain conventions you should set up and stick to. We don’t really see your types, so we don’t know whether the operators are implemented correctly or not.

First of all, matrices can be laid out in different ways, and the order of matrix multiplication goes hand in hand with that layout. As you may know, a matrix-matrix multiplication is actually no different from a matrix-vector multiplication: a Vector4 is essentially a 1 by 4 matrix (or a 4 by 1 one). How it is treated determines the order and layout of the matrix. The typical M * v order implies that each vector is treated as a column vector.
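
To make that concrete, here’s a small sketch using Unity’s Matrix4x4 (not your custom type): multiplying v on the left as a row vector gives the same result as multiplying the transposed matrix with v on the right as a column vector.

        // Treating v as a 1x4 row matrix: v' = v * M, a weighted sum of M's rows.
        static Vector4 MulRowVector(Vector4 v, Matrix4x4 m)
        {
            return v.x * m.GetRow(0) + v.y * m.GetRow(1)
                 + v.z * m.GetRow(2) + v.w * m.GetRow(3);
        }

        // Unity's own operator treats v as a 4x1 column matrix instead: v' = M * v.
        // The two conventions agree once the matrix is transposed:
        //     MulRowVector(v, m) == m.transpose * v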

Unity uses a column-major layout. I would strongly recommend the matrix crash course I posted over here; in that answer I explain how the perspective projection matrix works.

We don’t know what the constructor of your Matrix4 type actually takes. If it takes 4 column vectors, your printout is wrong; it’s transposed (you show the columns as rows). So make sure your types are actually implemented correctly. Those things can be tested in isolation.
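
For example, such an isolated test could look like this (a sketch assuming your Matrix4 exposes a [row, column] indexer; use your M00-style fields otherwise):

        // Build the same translation with Unity's API and with your code,
        // then compare all 16 entries (transpose one side if the layouts differ).
        Matrix4x4 expected = Matrix4x4.Translate(new Vector3(2, 2, 3));
        Matrix4 actual = Matrix4.Translation(new Vector3(2, 2, 3));
        for (int r = 0; r < 4; r++)
            for (int c = 0; c < 4; c++)
                if (Mathf.Abs(actual[r, c] - expected[r, c]) > 1e-5f)
                    Debug.LogError($"Mismatch at [{r},{c}]");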

The next issue I see is your “Camera” method. The camera / view matrix is simply the inverse of the camera’s transform. Your Camera method seems to create a transformation matrix based on a look-at target, but the naming seems inconsistent, and it’s not clear what “eye” and “at” actually mean. I would highly suggest using more descriptive names. As I said, the view / camera matrix should be the inverse of your desired view; the matrix you create treats “eye” and “at” strangely, and you never actually calculate the inverse of your view matrix.
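
For reference, a minimal look-at sketch built with Unity’s Matrix4x4 (column-vector convention, v' = V * v, so it’s transposed relative to your row-vector setup). Note the negated dot products; they are what make this the inverse of the camera’s transform:

        public static Matrix4x4 LookAt(Vector3 eye, Vector3 target, Vector3 up)
        {
            Vector3 zaxis = (eye - target).normalized;           // camera backward
            Vector3 xaxis = Vector3.Cross(up, zaxis).normalized; // camera right
            Vector3 yaxis = Vector3.Cross(zaxis, xaxis);         // camera up
            Matrix4x4 view = Matrix4x4.identity;
            view.SetRow(0, new Vector4(xaxis.x, xaxis.y, xaxis.z, -Vector3.Dot(xaxis, eye)));
            view.SetRow(1, new Vector4(yaxis.x, yaxis.y, yaxis.z, -Vector3.Dot(yaxis, eye)));
            view.SetRow(2, new Vector4(zaxis.x, zaxis.y, zaxis.z, -Vector3.Dot(zaxis, eye)));
            return view;
        }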

You also go straight against the common convention of a column-major layout with vectors multiplied on the right as column vectors: you multiply them on the left as row vectors. Almost no example you will find matches your layout; all your matrices would need to be transposed and every multiplication order reversed.
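
The identity that links the two conventions is (A * B)^T = B^T * A^T, which you can verify numerically with Unity’s types (a sketch, independent of your custom Matrix4):

        Matrix4x4 A = Matrix4x4.Translate(new Vector3(1, 2, 3));
        Matrix4x4 B = Matrix4x4.Scale(new Vector3(2, 2, 2));
        // Transposing a product reverses its order, which is why a row-vector
        // MVP chain runs backwards compared to a column-vector one:
        Debug.Log((A * B).transpose == B.transpose * A.transpose); // true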

I would highly recommend watching the video series on linear algebra by 3Blue1Brown (3B1B) on YouTube.

I haven’t really looked in detail at whether what you’re doing is correct; we can’t actually determine that, since a lot of code is missing. You also use unusual terminology. I guess “line” refers to a “row” of a matrix?

I just read your final question:

I’m not quite sure I understood what you’re actually asking here. We use 4D homogeneous coordinates in order to represent translation alongside rotation and scaling, which would not be possible in plain linear algebra, since in linear algebra the origin is always fixed. Homogeneous coordinates simply offset our 3D space by 1 into the 4th dimension, which allows us to do a shear operation in 4D space that results in a translation in 3D space after projecting back down. That is what actually happens when you renormalize the w coordinate.

You don’t lose the z coordinate: after the perspective divide, z contains the information that goes into the z buffer. What exactly z contains depends on your projection matrix; it is usually determined by M22 and M32 (which would be M23 in your case), where M22 scales the z value and the other entry adds an offset. In the end we divide by w, which just carries the view-space z value, and this maps z into the range -1 to 1. The mapping is non-linear, but it works: if z is at “near” you get -1, and if z is at “far” you get +1. This is what is written into the depth buffer.
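
Here’s a quick numeric check of that mapping, using your near/far values and the standard column-vector terms from above (one entry scales z, the other offsets it; the camera looks down -z):

        float near = 0.03f, far = 1000f;
        float m22 = -(far + near) / (far - near);    // scales z
        float m23 = -2f * far * near / (far - near); // offsets z
        foreach (float viewZ in new[] { -near, -far })
        {
            float clipZ = m22 * viewZ + m23; // z after the projection matrix
            float clipW = -viewZ;            // w picks up the view-space depth
            Debug.Log(clipZ / clipW);        // -1 at the near plane, +1 at the far plane
        }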

PS: Not that it matters for your use case right now, but you currently calculate your model matrix as M = Pos * Sc;. Since you use row vectors multiplied on the left, this would first offset your vector and then scale it. A model matrix should apply the position offset at the end: first rotate and scale, then add the position. Since your scale is currently (1, 1, 1) it doesn’t really matter here, but it doesn’t fit your multiplication order.
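
A one-dimensional illustration of the difference (plain numbers, nothing to do with your types):

        float x = 1f, scale = 2f, offset = 5f;
        float translateThenScale = (x + offset) * scale; // 12, what v * Pos * Sc does
        float scaleThenTranslate = x * scale + offset;   // 7,  the usual model-matrix order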

We haven’t yet talked about which coordinate space you actually assume. Unity uses a left-handed coordinate system where +z indicates the forward direction, +y is up, and +x is right. On the GPU, though, depending on the graphics API in use, the matrix may be flipped into a right-handed one, as is usual for OpenGL; Unity usually handles that for you behind the scenes. However, it becomes relevant when you look at it from the shader’s perspective.

I once created a 5x5 matrix in order to work with 4D homogeneous coordinates and render a hypercube, though I stuck as close as possible to Unity’s layout and conventions. Things are already difficult enough; when you need to convert between different coordinate systems, layouts, and multiplication orders, things simply get way too messy. Certainly doable when you have all the information at hand, but not really fun ^^