MTLPixelFormat / Unity TextureFormat for accessing the depth information buffer

Today I successfully added the following functionality to the Unity ARKit Plugin:

  • Create another buffer for the depth information
  • Hook everything up on the Unity side
  • Create a shader that uses the depth info as a mask

Purpose: Have the video texture (as it is now), but use depth as a mask to automatically cut out persons and reveal other content behind them.

Current state: the depth buffer seems to be read out, can be read on the Unity side, and is masking the video texture.

Unfortunately the depth image is rather garbled, because I don't know the correct pixel/texture formats.

If anybody can shed some light on this, that would be great.

Fetching depth is done in ARSessionNative.mm, the same way as the video buffers:

- (void)session:(ARSession *)session didUpdateFrame:(ARFrame *)frame

{
   // ... at the end of the method, add:

    if (frame.capturedDepthData == nil) {
        NSLog(@"no capturedDepthData");
    } else {

        CVPixelBufferRef pixelBufferDepth = frame.capturedDepthData.depthDataMap;

        if (pixelBufferDepth != NULL) {
            size_t imageDepthWidth = CVPixelBufferGetWidth(pixelBufferDepth);
            size_t imageDepthHeight = CVPixelBufferGetHeight(pixelBufferDepth);

            if (s_UnityPixelBuffers.bEnable)
            {
                CVPixelBufferLockBaseAddress(pixelBufferDepth, kCVPixelBufferLock_ReadOnly);

                if (s_UnityPixelBuffers.pDepthPixelBytes)
                {
                    unsigned long numBytes = CVPixelBufferGetBytesPerRowOfPlane(pixelBufferDepth, 0) * CVPixelBufferGetHeightOfPlane(pixelBufferDepth,0);

                    void* baseAddress = CVPixelBufferGetBaseAddressOfPlane(pixelBufferDepth,0);
                    memcpy(s_UnityPixelBuffers.pDepthPixelBytes, baseAddress, numBytes);
                }
                CVPixelBufferUnlockBaseAddress(pixelBufferDepth, kCVPixelBufferLock_ReadOnly);
            }

            // textureDepth: create a Metal texture from the depth pixel buffer
            id<MTLTexture> textureDepth = nil;
            {
                const size_t width = CVPixelBufferGetWidthOfPlane(pixelBufferDepth, 0);
                const size_t height = CVPixelBufferGetHeightOfPlane(pixelBufferDepth, 0);

                // WHAT IS THE CORRECT FORMAT???
                MTLPixelFormat pixelFormat = MTLPixelFormatR8Unorm;

                CVMetalTextureRef texture = NULL;
                CVReturn status = CVMetalTextureCacheCreateTextureFromImage(NULL, _textureCache, pixelBufferDepth, NULL, pixelFormat, width, height, 0, &texture);

                if(status == kCVReturnSuccess)
                {
                    textureDepth = CVMetalTextureGetTexture(texture);
                }

                if (texture != NULL)
                {
                    CFRelease(texture);
                }
            }

            if (textureDepth != nil) {
                dispatch_async(dispatch_get_main_queue(), ^{
                    s_CapturedImageTextureDepth = textureDepth;
                });

            }
        } else {
            // NSLog(@"no depthDataMap");
        }
    }

}

So, what is the correct format here?
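One way to narrow this down, rather than guessing, is to ask the pixel buffer itself for its format type. A minimal sketch using the CoreVideo query API; if the buffer reports kCVPixelFormatType_DepthFloat32, the matching Metal format would be MTLPixelFormatR32Float:

    // Sketch: log the buffer's actual pixel format instead of guessing.
    OSType format = CVPixelBufferGetPixelFormatType(pixelBufferDepth);

    if (format == kCVPixelFormatType_DepthFloat32) {
        // One 32-bit float per pixel -> MTLPixelFormatR32Float on the Metal side.
        NSLog(@"depth map is DepthFloat32");
    } else if (format == kCVPixelFormatType_DisparityFloat32) {
        // Disparity (1/distance), also one 32-bit float per pixel.
        NSLog(@"depth map is DisparityFloat32");
    } else {
        NSLog(@"depth map format: %u", (unsigned int)format);
    }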

And on the Unity side, to create the texture in UnityARVideo:

 public void OnPreRender()
 {
     // ... existing code; insert the depth part at the end, before the
     // existing UpdateExternalTexture calls ...

     // Texture Depth
     if (_videoTextureDepth == null) {
         // Depth size differs from the video texture: width = 640, height = 360.

         // What is the correct TextureFormat here???
         _videoTextureDepth = Texture2D.CreateExternalTexture(640, 360,
             TextureFormat.RGBA32, false, false, (System.IntPtr)handles.TextureDepth);
         _videoTextureDepth.filterMode = FilterMode.Bilinear;
         _videoTextureDepth.wrapMode = TextureWrapMode.Repeat;
         m_ClearMaterial.SetTexture("_textureMask", _videoTextureDepth);
     }

     _videoTextureDepth.UpdateExternalTexture(handles.TextureDepth);

     // Existing plugin code:
     _videoTextureY.UpdateExternalTexture(handles.TextureY);
     _videoTextureCbCr.UpdateExternalTexture(handles.TextureCbCr);

     m_ClearMaterial.SetMatrix("_DisplayTransform", _displayTransform);
 }

You should not try to simply copy and paste the above code snippets into your project; that is not going to work anyway, because more steps are needed to hook everything up. Moreover, the ARKit Unity Plugin is Unity intellectual property, so I guess I cannot distribute the current solution anyway, unless somebody from Unity allows me to do so.
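To give an idea of what hooking up means on the native side: the Metal texture pointer has to reach C# somehow, along the same lines as the plugin's existing Y/CbCr handles. A hypothetical sketch only; GetDepthTexturePtr is an invented name, and the real plugin routes its texture handles differently:

    // Hypothetical accessor (invented name), mirroring the pattern used for the
    // Y/CbCr textures. Unity would import it via [DllImport("__Internal")] and
    // pass the returned pointer to Texture2D.CreateExternalTexture.
    extern "C" void* GetDepthTexturePtr()
    {
        return (__bridge void*)s_CapturedImageTextureDepth;
    }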

This post is really only for people who know about pixel and texture formats, and admittedly this is more on the Apple/iOS side.

Some progress:
In ARSessionNative.mm, set the format for depth to

    MTLPixelFormat pixelFormat = MTLPixelFormatR32Float;

and in UnityARVideo.cs set it to TextureFormat.RGBA32.

This way I get an image in the correct proportions, undistorted. Strangely enough, it is not grayscale but rather red-scale: everything is red, with darker red for closer depth (black/dark red in front, pure red in the background). In hindsight that makes sense, since R32Float is a single-channel format, so the value only ever lands in the red channel when sampled.

I can use that to create a mask. Unfortunately it is rather blocky around the edges, which is not surprising given the depth map's low 640x360 resolution, but so what; that can be improved later.

Hello Berlin.
You got it right: the MTLPixelFormat is R32Float, and you can use UnityEngine.TextureFormat.RFloat.
But the values are not between 0 and 1; I think the values are in meters ;).
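That is easy to verify with the bytes copied in didUpdateFrame above: interpret them as raw floats and log the range. A minimal sketch, assuming a 640x360 DepthFloat32 map with no row padding (pDepthPixelBytes comes from the earlier snippet):

    // Sketch: read the copied depth bytes back as floats (values in meters).
    // Assumes pDepthPixelBytes was filled by the memcpy above, 640x360 pixels,
    // one 32-bit float each, and bytesPerRow == width * sizeof(float).
    const float *depth = (const float *)s_UnityPixelBuffers.pDepthPixelBytes;
    float minDepth = INFINITY, maxDepth = -INFINITY;

    for (size_t i = 0; i < 640 * 360; i++) {
        float d = depth[i];
        if (isnan(d)) continue; // pixels without a depth estimate can be NaN
        if (d < minDepth) minDepth = d;
        if (d > maxDepth) maxDepth = d;
    }
    NSLog(@"depth range: %f .. %f (meters)", minDepth, maxDepth);

For the mask this means the shader has to map those values into 0..1 itself, for example by dividing by a maximum distance or comparing against a threshold in meters.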