It seems that removing the concept of the near clipping plane would not affect the calculations.

A point in camera space with coordinates (X, Y, Z) has NDC coordinates of:

`(X / (tan(Fov / 2) * Z * Aspect), Y / (tan(Fov / 2) * Z), Z / Far)`
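The formula above can be sketched directly in code. This is a minimal illustration under my own naming (`camera_to_ndc` is not from the post); it assumes `Fov` is in radians and `Z` is the positive distance along the view direction:

```python
import math

def camera_to_ndc(x, y, z, fov, aspect, far):
    """Map a camera-space point to the NDC coordinates given in the post.

    fov is the vertical field of view in radians; z is the (positive)
    distance along the view direction.
    """
    h = math.tan(fov / 2) * z          # half-height of the frustum at depth z
    return (x / (h * aspect), y / h, z / far)

# A point on the top edge of the frustum at depth z = 10 maps to y = 1:
fov = math.radians(60.0)
x, y, z = 0.0, math.tan(fov / 2) * 10.0, 10.0
print(camera_to_ndc(x, y, z, fov, aspect=16 / 9, far=100.0))
# → (0.0, 1.0, 0.1)
```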

Here, `Fov` is the vertical field of view of the camera, `Aspect` is the aspect ratio of the screen, and `Far` is the far clipping plane. The mapping of Z might need to take the near clipping plane into account, but here I focus on the mapping of X and Y. [EDIT: see next posts]

Let me explain how my calculation works. In the diagram below, `H = tan(Fov / 2) * Z`. The vertical NDC coordinate of the point is `Y / H`, and the horizontal coordinate is derived the same way.

Now I understand that the main limitation comes from Z: in NDC, Z must be the result of a linear transformation of 1/Zview so that it interpolates correctly in screen space.
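That interpolation claim can be checked numerically. Below is a small sketch (all variable names are mine) that projects a 2D view-space segment onto a screen line, then verifies that linearly interpolating 1/z in screen space recovers the true depth while linearly interpolating z itself does not:

```python
# Endpoints of a line segment in view space (x, z):
x0, z0 = -1.0, 2.0
x1, z1 = 3.0, 10.0

def screen(x, z):
    return x / z  # perspective projection onto the screen

s0, s1 = screen(x0, z0), screen(x1, z1)

# Pick a view-space point one quarter of the way along the segment:
u = 0.25
x = x0 + u * (x1 - x0)
z = z0 + u * (z1 - z0)

# Its screen-space parameter between the projected endpoints:
t = (screen(x, z) - s0) / (s1 - s0)

# Linearly interpolating 1/z in screen space recovers the true depth...
inv_z = (1 - t) * (1 / z0) + t * (1 / z1)
print(abs(1 / inv_z - z) < 1e-12)    # → True

# ...while linearly interpolating z itself in screen space does not:
z_lerp = (1 - t) * z0 + t * z1
print(abs(z_lerp - z) > 1e-6)        # → True
```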

If it were possible to change how Z is mapped to NDC depth, the whole process would be much simpler. For example, depth = C / Wclip, where C is a constant. That way, n and f would only be used for clipping, and depth precision would not depend on them; n could be made very small, as long as it stays greater than 0. This requires a floating-point depth buffer, because the theoretical range of depth is now (0, infinity). It would be worth studying which value of C gives the most uniform distribution of floating-point depth values.
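One way to study that distribution: for positive IEEE-754 floats, the integer bit pattern is monotone, so the number of representable float32 values between two depths is just the difference of their bit patterns. A rough sketch, assuming a float32 depth buffer, C = 1, and Wclip = Zview (the function names are mine):

```python
import struct

def f32_bits(x):
    """Bit pattern of x as an IEEE-754 float32 (monotone for positive floats)."""
    return struct.unpack('<I', struct.pack('<f', x))[0]

def depth(z, c=1.0):
    # Proposed mapping: depth = C / Wclip, with Wclip = Zview.
    return c / z

def distinct_depths(z_near, z_far, c=1.0):
    """Number of representable float32 depth values between two view depths."""
    d0, d1 = depth(z_far, c), depth(z_near, c)   # depth decreases with z
    return f32_bits(d1) - f32_bits(d0)

# With C = 1, each factor-of-100 range of view depth gets a comparable
# number of float32 depth steps, i.e. the precision is roughly uniform
# per logarithmic distance:
print(distinct_depths(0.01, 1.0))
print(distinct_depths(1.0, 100.0))
print(distinct_depths(100.0, 10000.0))
```

This roughly logarithmic spread is the same idea exploited by reversed-Z floating-point depth buffers.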

But unfortunately, this seems to require support from the graphics API.

Yes, but the main problem comes from the fact that matrix multiplication cannot perform division at all. The reason we can represent projections at all is that we use 4D homogeneous coordinates, and the GPU performs the homogeneous divide at the end, which lets us apply a kind of division to the whole coordinate.
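To make that concrete, here is a sketch (plain Python, my own helper names) of an OpenGL-style perspective matrix: the matrix itself only copies `-z_view` into `w_clip`, and the division happens afterwards in the homogeneous divide:

```python
import math

def perspective(fov, aspect, near, far):
    """An OpenGL-style perspective matrix, row-major."""
    f = 1.0 / math.tan(fov / 2)
    return [
        [f / aspect, 0, 0, 0],
        [0, f, 0, 0],
        [0, 0, (far + near) / (near - far), 2 * far * near / (near - far)],
        [0, 0, -1, 0],   # this row copies -z_view into w_clip
    ]

def mul(m, v):
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

# A view-space point (OpenGL looks down -z, so z_view = -10):
p = [2.0, 3.0, -10.0, 1.0]
clip = mul(perspective(math.radians(60), 16 / 9, 0.1, 100.0), p)

# The matrix cannot divide; the GPU's homogeneous divide by w_clip does it:
ndc = [c / clip[3] for c in clip[:3]]
print(clip[3])   # → 10.0 (the view-space depth, ready to divide by)
print(ndc)       # x and y have now been divided by the view-space depth
```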

Just for the x and y projection we only need to divide by z and call it a day. However, calculating a meaningful z value for the z-buffer becomes a bit of an issue. Currently the z-buffer is non-linearly distributed, which is a blessing and a curse at the same time. It's actually great to have better depth resolution close to the camera and less resolution the further you get away from it, though at large distances this can create problems.

If you could use a linear depth buffer (as you suggested, which isn't possible with a matrix multiplication alone), the depth resolution close to the camera would be poor. In your case, assuming a far distance of 5000 units, you would have roughly 200 subdivisions per unit, which could be problematic when you render a surface close to the camera. So the inverse relationship solves many issues close to the camera, but at the same time "wastes" a lot of range in the depth buffer and limits how far you can / should set your far clipping plane. The smaller the near clipping plane, the worse this distribution gets. So for best results, pull the far clipping plane in as close as possible and push the near clipping plane out as far as possible.
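The trade-off described above can be made visible by counting how many depth-buffer steps each mapping spends per unit of view depth. A sketch under my own assumptions (a 24-bit integer depth buffer; the function names are mine):

```python
def hyperbolic_depth(z, n, f):
    """Standard [0, 1] depth: a linear transform of 1/z."""
    return (f / (f - n)) * (1 - n / z)

def linear_depth(z, n, f):
    """Hypothetical linear [0, 1] depth mapping, for comparison."""
    return (z - n) / (f - n)

def steps_per_unit(depth_fn, z, n, f, bits=24):
    """Approximate distinct depth-buffer values spent on [z, z + 1]."""
    scale = 2 ** bits - 1
    return (depth_fn(z + 1, n, f) - depth_fn(z, n, f)) * scale

# near = 0.1, far = 5000: the hyperbolic mapping spends vastly more
# precision near the camera, while the linear mapping spends the same
# everywhere (and therefore too little up close):
n, f = 0.1, 5000.0
for z in (1.0, 10.0, 1000.0):
    print(z, steps_per_unit(hyperbolic_depth, z, n, f),
          steps_per_unit(linear_depth, z, n, f))
```

The exact per-unit counts depend on the depth-buffer format, but the shape of the comparison does not: hyperbolic depth is far denser near the camera and far sparser in the distance than the linear alternative.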

PS: over here I posted my matrix crash course (and here's a mirror on GitHub).