Mapping MediaPipe 2D coordinates into Unity 3D world space

Hey everyone, I’m working on an AR app for mobile devices. To reduce the computational load on the device, the app sends camera frames to a Python server, which runs MediaPipe on them and returns the pose estimation data as a string. For the AR SDK, I’m testing both Vuforia and EasyAR, but for now I’m primarily using Vuforia.

The issue I’m facing is that the joint (landmark) coordinates are not mapped correctly. When I use the raw data, which MediaPipe returns normalized to the 0 to 1 range, the joints all cluster in a tiny area relative to the image. However, if I multiply the coordinates by the image’s width and height, I get pixel values, and placing objects at those positions puts them far outside the camera’s visible range in Unity.

How can I accurately map the coordinates to Unity’s world space?
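
A minimal sketch of one possible mapping, assuming a simple pinhole camera: the normalized landmark is flipped vertically (MediaPipe's origin is the top-left corner with y pointing down, while Unity's viewport origin is the bottom-left with y pointing up) and then scaled by the size of the view frustum at a chosen distance. The function name, the depth, the vertical field of view, and the aspect ratio are all hypothetical inputs for illustration; the real values would have to come from the AR camera, and the sketch also assumes the processed image has the same aspect ratio as that camera and is not mirrored.

```python
import math


def landmark_to_camera_space(x_norm, y_norm, depth, vertical_fov_deg, aspect):
    """Unproject a normalized image point to a camera-space point.

    Axes follow Unity's convention: x right, y up, z forward.
    All parameters are illustrative assumptions, not values from MediaPipe.
    """
    # Flip y: image origin is top-left (y down), viewport origin is bottom-left (y up).
    viewport_x = x_norm
    viewport_y = 1.0 - y_norm

    # Half the height and width of the view frustum at the given depth.
    half_height = depth * math.tan(math.radians(vertical_fov_deg) / 2.0)
    half_width = half_height * aspect

    # Map [0, 1] to [-half, +half] around the optical axis.
    cam_x = (viewport_x - 0.5) * 2.0 * half_width
    cam_y = (viewport_y - 0.5) * 2.0 * half_height
    return (cam_x, cam_y, depth)


if __name__ == "__main__":
    # A landmark at the image center lands on the optical axis.
    print(landmark_to_camera_space(0.5, 0.5, 2.0, 60.0, 16 / 9))
```

The result is in the camera's local space, so it would still need to be transformed by the camera's pose to get a world position. On the Unity side, `Camera.ViewportToWorldPoint(new Vector3(x, 1 - y, depth))` should do the equivalent unprojection and return a world-space point directly, which may be simpler than doing the math by hand.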