I am attempting to process the camera image through some AI models, based on the XRCpuImage I obtain as described in the documentation. First, I change the camera configuration to 640x480, which is reported as successful. Then I obtain the image, convert it, and store it in a Texture2D.
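For context, the acquisition and conversion I am doing looks roughly like this (a minimal sketch; the `m_CameraManager` field is assumed to be assigned to the scene's ARCameraManager, and the output format/transformation are just the ones I happen to use):

```csharp
using Unity.Collections;
using UnityEngine;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;

public class CpuImageReader : MonoBehaviour
{
    [SerializeField] ARCameraManager m_CameraManager;
    Texture2D m_Texture;

    void Update()
    {
        // Acquire the latest CPU-side camera image; must be disposed.
        if (!m_CameraManager.TryAcquireLatestCpuImage(out XRCpuImage image))
            return;

        using (image)
        {
            var conversionParams = new XRCpuImage.ConversionParams
            {
                inputRect = new RectInt(0, 0, image.width, image.height),
                outputDimensions = new Vector2Int(image.width, image.height),
                outputFormat = TextureFormat.RGBA32,
                // MirrorY compensates for the vertical flip between the
                // camera image and Unity's texture coordinate convention.
                transformation = XRCpuImage.Transformation.MirrorY
            };

            if (m_Texture == null)
                m_Texture = new Texture2D(image.width, image.height,
                                          TextureFormat.RGBA32, false);

            // Convert directly into the texture's backing buffer.
            var buffer = m_Texture.GetRawTextureData<byte>();
            image.Convert(conversionParams, buffer);
            m_Texture.Apply();
        }
    }
}
```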
The camera image comes out rotated 90 degrees to the left. I obviously want it to be right-side up, like the AR camera feed, and with the correct aspect ratio so the image doesn't appear stretched. I have already written a conversion class that can rotate the image about any axis and return a new image, but doing this per pixel is very slow. I was hoping there is a more optimized way I don't know about, similar to the Transformation option that XRCpuImage's ConversionParams provides, but for rotating the image while preserving the 640x480 aspect ratio.
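In case it helps others with the same bottleneck: one way to keep a CPU-side rotation from tanking the framerate is to move the per-pixel remap off the main thread with the C# Job System and Burst. A sketch for a 90° clockwise rotation of an RGBA32 buffer (the job name and field names are my own; treating each 4-byte pixel as one `uint` so the copy is a single assignment):

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;

[BurstCompile]
struct Rotate90Job : IJobParallelFor
{
    [ReadOnly]  public NativeArray<uint> Src; // one uint per RGBA32 pixel
    [WriteOnly] public NativeArray<uint> Dst; // rotated output, Height x Width
    public int Width;   // source width
    public int Height;  // source height

    // index runs over destination pixels; destination width is Height.
    public void Execute(int index)
    {
        int dstX = index % Height;
        int dstY = index / Height;
        // 90° clockwise: dst(dstX, dstY) = src(dstY, Height - 1 - dstX)
        int srcX = dstY;
        int srcY = Height - 1 - dstX;
        Dst[index] = Src[srcY * Width + srcX];
    }
}
```

The source array can come from the converted buffer via `NativeArray<byte>.Reinterpret<uint>(1)`; schedule with `Schedule(width * height, 64)` and complete the handle before reading `Dst`.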
Note: this is an Android application locked to portrait mode only. Also, ideally I can update the image roughly every 30 frames, as I am using an LSTM model that requires multiple frames to produce a result.
If anyone has experience with this, please let me know. Thank you!
Based on your reply, I looked into the code behind the subsystem and found TryGetLatestFrame(), which provides an XRCameraFrame. As you stated, this contains a projectionMatrix and a displayMatrix for each frame, and I can obtain it alongside my acquisition of the XRCpuImage via TryAcquireLatestCpuImage().
Alongside this, ARCameraFrameEventArgs also provides both matrices on a per-frame basis whenever the frameReceived event is fired by the ARCameraManager, so I can conveniently check each matrix when I obtain an XRCpuImage.
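For reference, grabbing the matrix each frame looks something like this (a sketch; the class name is mine, and `m_CameraManager` is assumed to reference the scene's ARCameraManager):

```csharp
using UnityEngine;
using UnityEngine.XR.ARFoundation;

public class DisplayMatrixTracker : MonoBehaviour
{
    [SerializeField] ARCameraManager m_CameraManager;
    Matrix4x4? m_DisplayMatrix;

    void OnEnable()  => m_CameraManager.frameReceived += OnFrameReceived;
    void OnDisable() => m_CameraManager.frameReceived -= OnFrameReceived;

    void OnFrameReceived(ARCameraFrameEventArgs args)
    {
        // displayMatrix is the transform ARCameraBackground uses to map
        // camera-image UVs onto the screen for the current orientation.
        m_DisplayMatrix = args.displayMatrix;
    }
}
```

This way the matrix stored when the event fires corresponds to the same frame as the XRCpuImage acquired in that update.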
I have obtained the current projectionMatrix and calculated the preferred displayMatrix, but I am still confused as to how I am meant to modify the displayMatrix before the image is displayed.
I also find it strange that very few people are talking about the issue of the XRCpuImage having an incorrect rotation, which to my understanding is an Android-related issue stemming from the physical camera itself. Am I missing something obvious? Is the expected behavior when obtaining the XRCpuImage for it to have this incorrect rotation relative to the screen orientation?
The proposed solution to this issue is much deeper than I anticipated, which seems excessive for a problem that appears so simple. But if I am correct in assuming that the displayMatrix is the key to the most efficient solution, please let me know how I can apply it before obtaining an XRCpuImage so that I receive the image in the proper orientation.
The rotation is not incorrect. As you say, the physical camera on the device is oriented perpendicular to the screen. What you get from the CPU image API is the raw camera image. If you want to rotate the pixels, you must apply your own rotation. You don’t need to modify our displayMatrix property to do so, but if you want to see how the transformation is calculated when the ARCameraBackground renders that image to the screen, we provide the matrix for you. You are welcome to rotate your CPU image pixels however you wish.
That makes complete sense. My main concern is the performance overhead of rotations away from the camera's default orientation. Performing this transformation per pixel is quite slow, and I can see it in the declining framerate. The ARCameraBackground seems to do it efficiently, though, and it is unclear to me how that is achieved. It seems a matrix is used somehow, but I am unsure how it factors into the image I am receiving.
If you have any more helpful information related to this, please let me know. Otherwise, I will be researching different ways to achieve this effect smoothly.
So, just to be clear: the way ARCameraBackground displays the camera pixels is by rendering them onto a textured quad in camera space using a projection transformation.
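If you want the same GPU-side approach for your own texture, the general idea is a single blit through a material whose shader remaps UVs by the display transform, rather than touching pixels on the CPU. A rough sketch (the material and its `_DisplayTransform` shader property are hypothetical stand-ins, not part of AR Foundation):

```csharp
using UnityEngine;

public static class GpuRotate
{
    static readonly int DisplayTransformId = Shader.PropertyToID("_DisplayTransform");

    // Rotates/reorients src into dst in one draw call. The material's
    // fragment shader is assumed to multiply its UVs by _DisplayTransform.
    public static void Rotate(Texture src, RenderTexture dst,
                              Material rotateMaterial, Matrix4x4 displayMatrix)
    {
        rotateMaterial.SetMatrix(DisplayTransformId, displayMatrix);
        Graphics.Blit(src, dst, rotateMaterial);
    }
}
```

The point is that the UV remap happens in the fragment shader, so the cost is one quad draw regardless of resolution, which is essentially what the camera background does.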
Maybe you can be more specific with your question? There are many ways to "rotate an image" depending on what you mean. For instance, you could have a GameObject with a RawImage component and modify that GameObject's Transform.
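If display is all you need, that route is nearly free, since no pixels are touched at all. A minimal sketch (class and field names are mine; assumes a RawImage in the scene showing the converted texture):

```csharp
using UnityEngine;
using UnityEngine.UI;

public class RotatedPreview : MonoBehaviour
{
    [SerializeField] RawImage m_Preview;

    public void Show(Texture2D cameraTexture)
    {
        m_Preview.texture = cameraTexture;
        // Counter the 90° offset between the sensor and the portrait screen
        // by rotating the UI element instead of the pixel data.
        m_Preview.rectTransform.localEulerAngles = new Vector3(0f, 0f, -90f);
    }
}
```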
Hello, I also use XRCpuImage to retrieve an image which I then process with an AI model to detect objects. However, the image I retrieve does not represent what is displayed on the screen of my Android phone. Have you found a solution? Thank you.
If you want the image that is displayed on your phone, why not just blit the image from the screen of your phone instead of repeating the entire render process a second time?
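A sketch of that screen readback, assuming you need the pixels on the CPU for the model (the coroutine waits for end of frame so the back buffer has been drawn; class name is mine):

```csharp
using System.Collections;
using UnityEngine;

public class ScreenGrabber : MonoBehaviour
{
    Texture2D m_ScreenTexture;

    // Start with StartCoroutine(Grab()) after rendering is set up.
    public IEnumerator Grab()
    {
        yield return new WaitForEndOfFrame();

        if (m_ScreenTexture == null)
            m_ScreenTexture = new Texture2D(Screen.width, Screen.height,
                                            TextureFormat.RGBA32, false);

        // Copies the rendered back buffer; the result is already oriented
        // the way the display shows it, so no extra rotation is needed.
        m_ScreenTexture.ReadPixels(new Rect(0, 0, Screen.width, Screen.height), 0, 0);
        m_ScreenTexture.Apply();
    }
}
```

Note that `ReadPixels` is a GPU-to-CPU sync point; `AsyncGPUReadback` is an alternative if the stall matters at your target framerate.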