ARFoundation XRCameraImage to viewport/screenspace conversion

Hi

I am using TryGetLatestImage to get the current camera image (on Android) and run object detection on it with TFLite. This works great: I get bounding boxes of objects in the camera image (the model runs asynchronously in under 100 ms).
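For reference, the acquisition step looks roughly like this (a sketch against the ARFoundation 2.x API; `OnImageConverted` is a placeholder name for the callback that hands the converted buffer to TFLite):

```csharp
// Sketch of the acquisition step (ARFoundation 2.x API).
if (cameraManager.TryGetLatestImage(out XRCameraImage image))
{
    var conversionParams = new XRCameraImageConversionParams
    {
        inputRect        = new RectInt(0, 0, image.width, image.height),
        outputDimensions = new Vector2Int(image.width, image.height),
        outputFormat     = TextureFormat.RGB24,
        transformation   = CameraImageTransformation.MirrorY
    };

    // Converts on a background thread; OnImageConverted (placeholder name)
    // receives the RGB buffer and feeds it to the TFLite interpreter.
    image.ConvertAsync(conversionParams, OnImageConverted);

    // The image can be disposed right away; the async conversion is independent of it.
    image.Dispose();
}
```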
Now I would like to convert those boxes back to screen space to draw an exact bounding box over each object. However, the XRCameraImage has a different aspect ratio and a wider crop than the screen: the screen is 2880x1440 (2:1), while the camera image is configurable from 640x480 (4:3) up to 720p/1080p (16:9). With 720p selected as the camera resolution, there is noticeably more image content around the edges of the frame than is visible on the phone screen.
Is there some matrix to convert between the two reference frames?

By the way: when the camera pose (transform.position & transform.rotation) is saved in the same frame that TryGetLatestImage is called, how well do the two pieces of data line up temporally? I know that ARCore can produce a high-accuracy timestamp for a camera capture, which could probably be used for some interpolation magic.
In other words: What is the most accurate way to get the device pose for a given XRCameraImage with ARFoundation?
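Right now I simply cache the pose in the same frame the image arrives; a sketch of that approach (here `arCamera` is an assumed reference to the pose-driven AR camera):

```csharp
// Sketch: grab the pose in the same frame the image is acquired, so the
// image and the pose are at most one frame apart.
if (cameraManager.TryGetLatestImage(out XRCameraImage image))
{
    Vector3 camPos = arCamera.transform.position;   // arCamera: pose-driven AR camera (assumed)
    Quaternion camRot = arCamera.transform.rotation;
    double imageTimestamp = image.timestamp;        // native capture timestamp, in seconds

    // ... kick off detection, then project the results using (camPos, camRot) ...
    image.Dispose();
}
```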


You can subscribe to the [ARCameraManager.frameReceived](https://docs.unity3d.com/Packages/com.unity.xr.arfoundation@latest?subfolder=/api/UnityEngine.XR.ARFoundation.ARCameraManager.html) event. The [ARCameraFrameEventArgs](https://docs.unity3d.com/Packages/com.unity.xr.arfoundation@latest?subfolder=/api/UnityEngine.XR.ARFoundation.ARCameraFrameEventArgs.html) includes the displayMatrix and projectionMatrix properties that contain the matrix information you are looking for.
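A minimal sketch of capturing the display matrix (assuming ARFoundation 2.x names; the UV-mapping helper at the end is illustrative, since the exact matrix convention differs between platforms):

```csharp
using UnityEngine;
using UnityEngine.XR.ARFoundation;

public class DisplayMatrixCapture : MonoBehaviour
{
    [SerializeField] ARCameraManager cameraManager;

    Matrix4x4 displayMatrix = Matrix4x4.identity;

    void OnEnable()  => cameraManager.frameReceived += OnFrameReceived;
    void OnDisable() => cameraManager.frameReceived -= OnFrameReceived;

    void OnFrameReceived(ARCameraFrameEventArgs args)
    {
        // The display matrix relates normalized camera-image UVs to normalized
        // display (viewport) UVs, accounting for rotation and aspect-ratio crop.
        if (args.displayMatrix.HasValue)
            displayMatrix = args.displayMatrix.Value;
    }

    // Illustrative: on Android the background shader applies the display matrix
    // to a screen-quad UV to get the camera-texture UV (viewport -> image), so
    // for image -> viewport (your bounding boxes) use displayMatrix.inverse.
    public Vector2 ViewportToImageUv(Vector2 viewportUv)
    {
        Vector4 uv = displayMatrix * new Vector4(viewportUv.x, viewportUv.y, 1f, 0f);
        return new Vector2(uv.x, uv.y);
    }
}
```

Since this is the same transform the ARCameraBackground shader uses to place the camera texture on screen, it already encodes the extra crop you see at 720p: mapping the four corners of a detected box through the (inverse) display matrix should land them exactly where the background renders that image content.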

Hi BenjaminBachman, I am running into the same problem. Did you find a solution for it?