Hey guys, I’m fairly new to Graphics, so please be gentle :).
In my world, I have a camera that captures images, and I’m trying to find a way to get the real distance between every pixel in my camera’s view and the camera, and store that off into a file. I saw something similar attempted with shaders here:
however I’m not sure if this worked for the user or if it’s even the same as what I’m trying to do. Currently I’ve created a depth shader that renders a Z-buffer image, and I’m able to read the RGB grey-scale values, which map to some sort of distance. I was thinking I could use my near and far clipping planes to interpolate and calculate the distance from the grey-scale color, but I’m not sure if that’s on the right track.
Are shaders the right approach? If so, could anyone point me in the right direction as to what I’d need to do to get my real-world distance? If not, I’ve also heard about ray casting and saw Unity’s Camera.ScreenToWorldPoint function and was wondering if those could be useful in some way.
Hopefully my question isn’t too confusing. Thanks!
I looked into ray casting; it seems I can get the distance between the camera and an object, but I’m not sure if it’s possible to get the distance between the camera and each pixel. Do you know if that’s possible? I also tried Camera.ScreenToWorldPoint and it seems to give me the same vector for any pixel I try. Maybe I’m approaching it wrong…
What I did was store my image to a texture, read all the pixels from it, and call Camera.ScreenToWorldPoint on each pixel in that array.
Camera.ScreenToWorldPoint takes an on-screen pixel position and a z distance from the camera, and outputs the world position of the point at that pixel and that distance from the camera. It knows nothing about the scene being rendered. All of the similar methods on the camera are like that; they just apply transform matrices for you to switch between coordinate spaces.
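A toy pinhole unprojection (a plain-Python sketch of the idea, not Unity’s actual implementation) makes the pitfall concrete: the pixel coordinates only fan the result out in proportion to the z distance you pass in, so a z of 0 collapses every pixel to the camera’s own position — which would explain getting the same vector for every pixel.

```python
import math

def screen_to_world(px, py, z, width=640, height=480, fov_y_deg=60.0):
    """Toy pinhole unprojection: pixel + view-space distance -> camera-space point.
    Mimics the idea behind Camera.ScreenToWorldPoint, not its exact math."""
    half_h = math.tan(math.radians(fov_y_deg) / 2.0)
    half_w = half_h * width / height
    # Pixel -> normalized device coordinates in [-1, 1]
    ndx = (px / width) * 2.0 - 1.0
    ndy = (py / height) * 2.0 - 1.0
    # Offsets scale with z: at z = 0 every pixel lands on the camera itself
    return (ndx * half_w * z, ndy * half_h * z, z)
```

The real ScreenToWorldPoint also applies the camera’s transform and projection matrices, but the key point holds: the z comes from you, not from the scene.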
Ray casts only work against colliders in the scene; they have no knowledge of what is rendered on screen. You could conceivably give everything in the scene a mesh collider and get something pretty close, but this will be very slow.
Rendering depth out to a texture is the correct way to do this if you actually want the depth of the pixels on screen. Unity usually renders a depth texture already; you could blit it into your own render texture and then use GetPixels() to get that data onto the CPU. This will be slow too, though not as slow as ray casting every pixel on screen.
However in general I would say if you’re trying to read pixel values on the CPU, you’re probably “doing it wrong”.
I’m not sure if I have exactly what you meant. I’m able to get a rendering like this:
I’m also able to get each pixel’s RGBA, just not sure how to get the real world distance.
Also, I can get this output in EXR format for higher precision, if that helps with anything.
The question is what you are rendering: is that the depth buffer, or view depth? If it’s the depth buffer / clip-space z, then it’s a non-linear depth and you need to look up how to transform a depth buffer into linear space (hint: you need the camera’s near and far clipping distances).
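For reference, under one common convention (a 0–1 depth-buffer value from a standard, non-reversed perspective projection) the linearization looks like this — treat it as a sketch and check it against your platform’s actual projection:

```python
def depth_buffer_to_eye_depth(d, near, far):
    """Convert a non-linear 0..1 depth-buffer value into linear view-space
    distance, assuming a conventional (non-reversed) perspective projection."""
    return (near * far) / (far - d * (far - near))
```

Note how d = 0.5 with near = 1 and far = 100 lands at only ~1.98: most of the depth buffer’s precision sits close to the camera. In a shader, Unity’s LinearEyeDepth helper does this transform for you.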
You’re also going to run into precision issues: depth buffers are generally a single 24- or 32-bit value, but EXR is only 16 bits per channel at best, and everything else (that Unity has built-in support for writing out) is 8 bits per channel. You may need to do something like what the CameraDepthNormals texture does and encode the depth across multiple channels, so you get better precision while still storing in an 8 bpc texture format.
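The multi-channel trick works along the lines of Unity’s EncodeFloatRGBA/DecodeFloatRGBA shader helpers; here’s the same scheme in plain Python (my own port of the idea — double-check it against the actual UnityCG.cginc source):

```python
def encode_float_rgba(v):
    """Pack a float in [0, 1) into four 8-bit-friendly channels, each holding
    successively finer bits (the EncodeFloatRGBA scheme)."""
    mul = (1.0, 255.0, 65025.0, 16581375.0)
    enc = [(v * m) % 1.0 for m in mul]  # fractional parts at each scale
    # Subtract the next channel's contribution so channels don't overlap
    carry = (enc[1], enc[2], enc[3], enc[3])
    return tuple(e - c / 255.0 for e, c in zip(enc, carry))

def decode_float_rgba(rgba):
    """Reassemble the packed channels back into a single float."""
    inv = (1.0, 1.0 / 255.0, 1.0 / 65025.0, 1.0 / 16581375.0)
    return sum(c * i for c, i in zip(rgba, inv))
```

After the blit you’d read the texture back with GetPixels() and run the decode on the CPU.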
That is outputting normalized view depth: you’re using Linear01Depth, which is zero at the camera, 1 at the far plane, and linear in between. If your camera’s far plane is 1000 and you’re storing into an RGBA32 texture, a color value of 1/255 is roughly 4 meters away (1/255 * 1000). Your far plane looks to be a lot shorter than that, which is good, and I don’t know what kind of accuracy you want or need.
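So recovering real distance from a Linear01Depth value is just a multiply by the far plane — a minimal sketch, assuming the value really is Linear01Depth:

```python
def linear01_to_meters(depth01, far_plane):
    """Linear01Depth is 0 at the camera and 1 at the far plane, so the
    view-space distance is just the value scaled by the far plane."""
    return depth01 * far_plane

# Smallest nonzero step of an 8-bit channel with far = 1000:
step = linear01_to_meters(1.0 / 255.0, 1000.0)  # ~3.92 m per grey level
```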
Yeah, I have my far clipping plane set to 40. So if I’m understanding this right, my depthValue is actually related to the real-world depth, just not scaled properly, and all I’d have to do is scale it appropriately? Regarding accuracy, I’m trying to get at least 16 bits per channel, which should be fine with the EXR encoding.