Hi everyone!
I’d like to discuss the possibility of implementing a co-location system for multiplayer mixed reality apps running on multiple Meta Quest 3 devices that share the same room.
I know Meta’s own SDK has proprietary features for this purpose, like Spatial Anchors, but since there is a lot of cloud setup involved, I’m trying to get ahead with a “custom” implementation that sticks to the XR Interaction Toolkit, AR Foundation and Meta OpenXR, while managing multiplayer with Unity NGO.
I started from the VR Multiplayer Template to have a solid network base for my app, and now I’m experimenting with strategies to synchronize players’ virtual positions with their physical positions. To do so, I need all users (host and clients) to share a point that can be considered equally placed in both physical and virtual space, to be used as a sort of frame-of-reference origin.
The only approach that has come to my mind so far is to access the Meta Quest spatial information, retrieve the ARPlanes, and have the host spawn a networked ARAnchor at the center of a reference plane (for example the floor or the ceiling).
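To make the host side concrete, here is a minimal sketch of what I have in mind, assuming AR Foundation 6’s plane classification API and a sharedAnchorPrefab registered with NGO (all names are placeholders from my head, not working code from the project, and for brevity it skips attaching the actual ARAnchor component to the spawned object):

// Host side: find the floor plane and spawn a networked anchor at its center
using Unity.Netcode;
using UnityEngine;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;

public class SharedAnchorSpawner : NetworkBehaviour
{
    public ARPlaneManager planeManager;    // AR Foundation plane manager in the scene
    public GameObject sharedAnchorPrefab;  // prefab with a NetworkObject component

    public void SpawnSharedAnchor()
    {
        if (!IsHost) return;

        foreach (var plane in planeManager.trackables)
        {
            // Only use the plane the runtime classified as the floor
            if ((plane.classifications & PlaneClassifications.Floor) == 0) continue;

            // ARPlane.center is already expressed in world (session) space
            var anchor = Instantiate(sharedAnchorPrefab, plane.center, Quaternion.identity);
            anchor.GetComponent<NetworkObject>().Spawn();
            break;
        }
    }
}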
A client that joins the host’s lobby would then compute the offset between its local reference plane center (e.g. its local floor center) and the ARAnchor position shared by the host, and use that offset to relocate the client’s XROrigin.
// Example snippet using the floor as reference (client side)
// Offset that moves the locally detected floor center onto the shared anchor position
var localToSharedOffset = sharedAnchor.transform.position - arFloorPlaneLocal.center;
xrOrigin.transform.position += localToSharedOffset;
A more complex version could use more than a single plane as reference. For example, all planes recognized by the ARPlaneManager that are classified as Floor, WallFace or Ceiling could be used to compute a sort of room centroid, which the host would use to place the shared ARAnchor. The client would compute the same centroid from its local ARPlanes, evaluate the offset between the local centroid and the shared anchor, and apply that offset to its XROrigin (something like the sketch below).
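A rough sketch of the centroid computation I’m picturing, again assuming AR Foundation 6’s PlaneClassifications flags (the helper name and structure are just illustrative):

// Compute a rough "room centroid" from planes classified as floor, ceiling or wall
using UnityEngine;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;

public static class RoomCentroid
{
    static readonly PlaneClassifications kRoomPlanes =
        PlaneClassifications.Floor | PlaneClassifications.Ceiling | PlaneClassifications.WallFace;

    public static bool TryCompute(ARPlaneManager planeManager, out Vector3 centroid)
    {
        var sum = Vector3.zero;
        var count = 0;

        foreach (var plane in planeManager.trackables)
        {
            if ((plane.classifications & kRoomPlanes) == 0) continue;
            sum += plane.center;   // world-space center of each classified plane
            count++;
        }

        centroid = count > 0 ? sum / count : Vector3.zero;
        return count > 0;
    }
}

The host would place the shared anchor at its own centroid, and the client would apply sharedAnchor.transform.position - localCentroid to its XROrigin, exactly like the single-plane case above.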
Now I’m asking for some feedback, since I’m still trying to make this approach work. Do you have any advice? Is this logic sound, or am I missing something?
I figure one problem could be that this method assumes all headsets have registered very similar room data (wall, floor and ceiling dimensions). Can you suggest other ways to obtain a shared frame of reference?
And finally, can this feature actually be implemented with this combination of tools, without Meta Spatial Anchors? Or are there limitations I’ve missed that prevent it?
Thanks for reading! I hope to open a nice discussion.