I am trying to extract the pixel values of the depth image generated by AROcclusionManager, but I can only get 1,1,1,1 for RGBA in the pixel values. I took a look at this thread: Environmental Occlusion and Depth in ARFoundation and saw the solution proposed by j0schm03 in #66. However, I need to extract the information in real time (just the pixel values, with no need to convert them to another render texture), and that solution makes my app lag seriously. I saw that we can also use TryAcquireEnvironmentDepthCpuImage to access the XRCpuImage and read the values from there, but I was unable to make it work even after reading the documentation: Struct XRCpuImage | AR Subsystems | 4.0.12, as there is no method that directly accesses the values. Has anyone worked with XRCpuImage before who can help?
Or is there a way to make j0schm03’s method faster?
The ARFoundation version is 4.1.7 and the platform I would like to use it on is Android.
TryAcquireEnvironmentDepthCpuImage() is indeed the correct and most efficient way to access the depth information on the CPU. Could you please share the script where you use TryAcquireEnvironmentDepthCpuImage? I’ll try to understand why it doesn’t work.
I don’t have any code yet because I wasn’t able to work out how to do it.
In my understanding of the documentation, the XRCpuImage struct has the properties height, width, format, dimensions, timestamp, planeCount, and valid, plus methods that convert the XRCpuImage into other supported formats. But there does not seem to be a way to access the values directly.
However, I do not need a texture, just the values, and building an entire render texture / Texture2D seems to hurt performance, since the code will likely live in a script similar to ARPointCloudVisualizer and be called on every point cloud update. Is there a way to get around building such a texture?
Here is an example script I wrote for you.
It prints the depth in meters at the center of the depth texture. This is the most efficient way to retrieve the information, and you can execute the code every frame.
using System;
using System.Collections;
using UnityEngine;
using UnityEngine.Assertions;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;

public class GetDepthOfCenterPixel : MonoBehaviour {
    // assign this field in the Inspector
    [SerializeField] AROcclusionManager manager = null;

    IEnumerator Start() {
        while (ARSession.state < ARSessionState.SessionInitializing) {
            // manager.descriptor.supportsEnvironmentDepthImage returns a correct value
            // only once ARSession.state >= ARSessionState.SessionInitializing
            yield return null;
        }
        if (!manager.descriptor.supportsEnvironmentDepthImage) {
            Debug.LogError("!manager.descriptor.supportsEnvironmentDepthImage");
            yield break;
        }
        while (true) {
            if (manager.TryAcquireEnvironmentDepthCpuImage(out var cpuImage) && cpuImage.valid) {
                using (cpuImage) {
                    Assert.IsTrue(cpuImage.planeCount == 1);
                    var plane = cpuImage.GetPlane(0);
                    var dataLength = plane.data.Length;
                    var pixelStride = plane.pixelStride;
                    var rowStride = plane.rowStride;
                    Assert.AreEqual(0, dataLength % rowStride, "dataLength should be divisible by rowStride without a remainder");
                    Assert.AreEqual(0, rowStride % pixelStride, "rowStride should be divisible by pixelStride without a remainder");
                    var centerRowIndex = dataLength / rowStride / 2;
                    var centerPixelIndex = rowStride / pixelStride / 2;
                    var centerPixelData = plane.data.GetSubArray(centerRowIndex * rowStride + centerPixelIndex * pixelStride, pixelStride);
                    var depthInMeters = convertPixelDataToDistanceInMeters(centerPixelData.ToArray(), cpuImage.format);
                    print($"depth texture size: ({cpuImage.width},{cpuImage.height}), pixelStride: {pixelStride}, rowStride: {rowStride}, pixel pos: ({centerPixelIndex}, {centerRowIndex}), depthInMeters of the center pixel: {depthInMeters}");
                }
            }
            yield return null;
        }
    }

    float convertPixelDataToDistanceInMeters(byte[] data, XRCpuImage.Format format) {
        switch (format) {
            case XRCpuImage.Format.DepthUint16:
                // DepthUint16 stores depth in millimeters
                return BitConverter.ToUInt16(data, 0) / 1000f;
            case XRCpuImage.Format.DepthFloat32:
                // DepthFloat32 stores depth in meters
                return BitConverter.ToSingle(data, 0);
            default:
                throw new Exception($"Format not supported: {format}");
        }
    }
}
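To make the two format branches above concrete, here is a standalone sketch with no Unity dependency that mirrors the decoder: DepthUint16 packs millimeters into a 16-bit unsigned integer, while DepthFloat32 stores meters directly. The class and method names are mine, not part of ARFoundation.

```csharp
using System;

public static class DepthDecodeDemo
{
    // DepthUint16: depth in millimeters as a 16-bit unsigned integer.
    public static float DecodeUint16(byte[] pixelData) =>
        BitConverter.ToUInt16(pixelData, 0) / 1000f;

    // DepthFloat32: depth in meters as a 32-bit float.
    public static float DecodeFloat32(byte[] pixelData) =>
        BitConverter.ToSingle(pixelData, 0);

    public static void Main()
    {
        // A DepthUint16 pixel holding 1500 mm decodes to 1.5 m.
        Console.WriteLine(DecodeUint16(BitConverter.GetBytes((ushort)1500)));
        // A DepthFloat32 pixel holding 2.25 decodes to 2.25 m.
        Console.WriteLine(DecodeFloat32(BitConverter.GetBytes(2.25f)));
    }
}
```

This is why pixelStride matters when slicing plane.data: a DepthUint16 pixel is 2 bytes, a DepthFloat32 pixel is 4 bytes.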
I implemented the code you suggested and it worked smoothly, BUT when I tried to extend it to points other than the center point, the depth information seems to be wrong. My approach is to convert a world coordinate to a screen position and use it as the row and pixel index, as shown below:
// Convert world position to screen position
Vector3 screenpos = aRCamera.WorldToScreenPoint(pos);

// Extract depth using the depth API
if (aROcclusionManager.TryAcquireEnvironmentDepthCpuImage(out var cpuImage) && cpuImage.valid)
{
    using (cpuImage)
    {
        Assert.IsTrue(cpuImage.planeCount == 1);
        var plane = cpuImage.GetPlane(0);
        var dataLength = plane.data.Length;
        var pixelStride = plane.pixelStride;
        var rowStride = plane.rowStride;
        Assert.AreEqual(0, dataLength % rowStride, "dataLength should be divisible by rowStride without a remainder");
        Assert.AreEqual(0, rowStride % pixelStride, "rowStride should be divisible by pixelStride without a remainder");
        var RowIndex = (int)screenpos.y;
        var PixelIndex = (int)screenpos.x;
        var PixelData = plane.data.GetSubArray(RowIndex * rowStride + PixelIndex * pixelStride, pixelStride);
        var depthInMeters = CPUImageWrap.convertPixelDataToDistance(PixelData.ToArray(), cpuImage.format);
        screenpos.z = depthInMeters;
    }
}

// Convert the screen position with the new depth back to a world position
Vector3 posNew = aRCamera.ScreenToWorldPoint(screenpos);
Is it because a screen pixel cannot be mapped directly to a cpuImage pixel? What can I do to make it work? Is there a conversion scheme from screen coordinates to cpuImage coordinates?
The depth texture is typically smaller than the actual screen resolution, so you have to scale your screen coordinates:
var depthTextureX = (int) (cpuImage.width * (screenPos.x / Screen.width));
var depthTextureY = (int) (cpuImage.height * (screenPos.y / Screen.height));
var pixelData = plane.data.GetSubArray(depthTextureY * rowStride + depthTextureX * pixelStride, pixelStride);
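The scaling step can be factored into a small standalone helper (my own sketch, not an ARFoundation API; the clamp keeps a point exactly on the right or bottom screen edge from producing an out-of-range index):

```csharp
using System;

public static class DepthSampling
{
    // Maps a screen pixel to depth-image indices by uniform scaling.
    public static (int x, int y) ScreenToDepthIndex(
        float screenX, float screenY,
        int screenWidth, int screenHeight,
        int depthWidth, int depthHeight)
    {
        int x = (int)(depthWidth * (screenX / screenWidth));
        int y = (int)(depthHeight * (screenY / screenHeight));
        // Clamp so that screenX == screenWidth (the exact edge) stays in range.
        return (Math.Clamp(x, 0, depthWidth - 1), Math.Clamp(y, 0, depthHeight - 1));
    }

    public static void Main()
    {
        // Center of a 1920x1080 screen sampled into a 160x90 depth image.
        var (x, y) = ScreenToDepthIndex(960, 540, 1920, 1080, 160, 90);
        Console.WriteLine($"({x}, {y})"); // (80, 45)
    }
}
```

The screen and depth-image sizes here are made-up examples; in your script you would pass Screen.width/Screen.height and cpuImage.width/cpuImage.height.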
Moreover, the depth texture is not aligned with the screen coordinates, so you have to apply the displayMatrix to the depth texture coordinates if you’re looking for perfect precision: https://discussions.unity.com/t/837130/4
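As a rough sketch of what applying the display matrix looks like: ARFoundation’s background shader transforms normalized UVs as uv' = (M * float4(u, v, 1, 0)).xy. The exact convention (row- vs. column-major, and which entries carry the affine terms) is platform-specific, so treat the indexing below as an assumption to verify against the linked thread; the helper itself is mine, not ARFoundation API.

```csharp
using System;

public static class DisplayMatrixDemo
{
    // Applies a 4x4 display matrix (row-major float[16], an assumed layout)
    // to a normalized UV, following uv' = (M * float4(u, v, 1, 0)).xy.
    public static (float u, float v) TransformUv(float[] m, float u, float v)
    {
        float uOut = m[0] * u + m[1] * v + m[2];
        float vOut = m[4] * u + m[5] * v + m[6];
        return (uOut, vOut);
    }

    public static void Main()
    {
        // With an identity matrix the UV passes through unchanged.
        float[] identity = {
            1, 0, 0, 0,
            0, 1, 0, 0,
            0, 0, 1, 0,
            0, 0, 0, 1,
        };
        var (u, v) = TransformUv(identity, 0.25f, 0.75f);
        Console.WriteLine($"{u}, {v}"); // 0.25, 0.75
    }
}
```

In practice you would take the matrix from ARCameraFrameEventArgs.displayMatrix, transform the normalized screen UV, and only then scale by cpuImage.width/height.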
Changing those few lines to your suggested code suddenly introduced a build error. I’m sure the error didn’t exist before the change, and I can’t figure out what happened even after looking at the console. Do you have any idea?