Fully Immersive UI using XR Interaction Toolkit

I’m currently working through adapting the UI of an existing VR application to work with XRI and the Vision Pro. Our previous interactions were built using Poke Interactions. I attempted to use XRI to implement a simple button-press interface, but it seems the Gaze + Pinch interaction (clicking in the simulator with the left-most option selected) doesn’t actually select buttons.

While exploring this, I was able to get the “HandsDemoScene” working on the simulator (by adding an AR Session to the scene), but currently there’s no way to test hand interaction in the simulator. What’s the recommended best practice for building UIs if we don’t have access to a device? Trying to prep for a Developer Lab session, but nervous we won’t have a UI ready without some set of best practices.

I would like to know this too. I haven’t been able to get any input to register in the simulator. I have tried some of the XRI samples, and while I can navigate around the scene and control the camera, clicks in the simulator don’t seem to do anything.


If I’m not mistaken, in the simulator you can do “Click” and “Pinch/Hold” with the left mouse button. Not sure if that’s what you need, but I think that’s the extent of what the simulator provides.

The visionOS Template shows both actions and how they work in the simulator.

You’re saying input works for you with App Mode set to Virtual Reality - Fully Immersive? I have run the sample scenes from the visionOS Template 0.4.3 in Fully Immersive and cannot get left mouse button input to cause anything to happen. With App Mode set to Mixed Reality, I can grab objects, and canvas UI elements are highlighted on rollover, but clicking doesn’t do anything.

So clicking the expand view button in the template doesn’t do anything for you?

I’m not totally sure this applies to the simulator, but there was chatter in another thread about gaze interactors not working until the next release. See here:

Correct, it doesn’t do anything

> So clicking the expand view button in the template doesn’t do anything for you?

Correct, it doesn’t do anything in Fully Immersive (VR) mode.

Input in VR is not functional in 0.4.x and below. It will be available in our next release.

As for using XRI, we provide a sample scene called XRIDebug that shows how to set up our PolySpatial-specific XRTouchSpaceInteractor. Awkwardly, the input action map is missing its PrimaryWorldTouch action, and the script is still set up to use the now-obsolete WorldTouchState struct. This shouldn’t be a huge issue, since you can still set things up manually. It may also be possible, now that the gaze ray comes through the input system, to use a RayInteractor, but I can’t confirm whether that works at the moment.

We may not be able to get the samples fixed for the next release, but we should be able to push a point release shortly after that fixes all of this up. Hopefully the slightly broken XRIDebug scene is enough to get you started.


This is expected. The template is set up assuming you are building to MR, so that “expand view” button is for transitioning from a shared space to fully immersive MR. You’re already in a fully immersive space in VR mode, and we don’t use the VolumeCamera component for anything, so that button won’t have any effect.

I see. Does that mean the XR Grab Interactor / XR Grab Interactable from XRI will work on Vision Pro in the next release?

I think you mean XR Direct Interactor for that first one? There is no Grab Interactor.

We had some trouble with Direct Interactor due to the way input works on visionOS. The Direct Interactor expects that you have a continuously tracked controller or hand pose which can update its position ahead of the interaction. So when your select/activate input comes along, your interactor is already overlapping with the interactable. In the case of visionOS, device position is only provided on the same frame that you pinch/poke, so the overlap test can be inconsistent.

We recommend that you use the XRTouchSpaceInteractor. Some of this will be cleaned up in our next release, but we won’t have a 100% working sample. Stay tuned… I’ll update this thread when everything is ready.

Any update on this?

- Should we be able to use XRRayInteractor? XRTouchSpaceInteractor?
- Should these work in the simulator, or only on device?
- Should these work with canvas UI elements?
- Are there any samples demonstrating user input in VR/fully immersive apps? Examples that show using pinch to click buttons?

Hey there! Sorry for the delay on this. We’re not quite done with our XRI updates, but you should be able to find enough examples to get you started. Make sure you’re using the latest version (0.6.3) of the PolySpatial packages.

Yes. If you look at the package samples for com.unity.xr.visionos, you should see a GameObject named XRI with a ray interactor and a pair of grab interactors set up for use with VisionOSSpatialPointerDevice. This can be used to interact with XRI objects in Virtual Reality apps.

Yes. There is an example of how to use this in the PolySpatial package samples. If you import the Unity PolySpatial Samples from com.unity.polyspatial in the package manager UI, you should see a scene called XRIDebug. This scene includes a properly configured XRTouchSpaceInteractor and two grab interactables.

These interactions are based on the gaze/pinch gesture, which works in the visionOS simulator and on a Vision Pro device. They are also “simulated” in play mode in the Unity Editor. You must have the PolySpatial Runtime enabled for XRTouchSpaceInteractor/mixed reality input simulation, and you must add a VisionOSPlayModeInput component to your scene (any object will do) for VR input simulation (for use with XRRayInteractor).
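If you want to guarantee the VR simulation component is present without editing every scene, something like the following editor-only bootstrap is one option. This is just a sketch; the namespace for VisionOSPlayModeInput is an assumption on my part, so check the com.unity.xr.visionos package for the actual type location.

```csharp
// Editor-only bootstrap: ensures the scene contains the VisionOSPlayModeInput
// component needed for VR input simulation in play mode.
// NOTE: the namespace below is an assumption; check the com.unity.xr.visionos
// package for where VisionOSPlayModeInput actually lives in your version.
#if UNITY_EDITOR
using UnityEngine;
using UnityEngine.XR.VisionOS; // assumed namespace for VisionOSPlayModeInput

public static class PlayModeInputBootstrap
{
    [RuntimeInitializeOnLoadMethod(RuntimeInitializeLoadType.AfterSceneLoad)]
    static void EnsurePlayModeInput()
    {
        // Per the thread, any GameObject will do; create one if none exists.
        if (Object.FindObjectOfType<VisionOSPlayModeInput>() == null)
            new GameObject("VisionOSPlayModeInput").AddComponent<VisionOSPlayModeInput>();
    }
}
#endif
```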

No, these interactors do not enable interaction with canvas UI, a.k.a. UGUI. At least, I haven’t tested them with the XRUIInputModule. We have a separate input solution for hooking up gaze/pinch input to UI canvases in Mixed Reality. We recently discovered a bug with canvas input in Mixed Reality. The UI objects must be in view of the Main Camera, even though that camera is not used for rendering. For Virtual Reality, we still need to do some testing, but it might “just work” already if you use XRUIInputModule.

If you’re having trouble with canvas input, let’s start a new thread about that, since it’s a different problem than enabling XRI. In fact, you may want to take a look at this existing thread about canvas UI in Mixed Reality.

Yep! See my comment earlier about package samples in com.unity.xr.visionos. Canvas UI is a notable exception, but that should get you started with XRI. And, of course, you can set up input system actions bound to VisionOSSpatialPointerDevice controls and write C# code to process gaze/pinch input. You have access to the gaze ray on the first input event, and a “device position”, which is the location in world space where your thumb and index finger meet. You can also rotate your hand, which will update “device rotation.” Since there are no RealityKit objects for the gaze ray to intersect with, there is no “interaction position” like you have in SpatialPointerDevice for Mixed Reality.
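To illustrate the C# route, here’s a minimal sketch that binds an input action to the spatial pointer and logs the gaze ray origin and pinch position. The binding path, state struct name, and namespace are assumptions based on this thread; check the com.unity.xr.visionos package samples for the exact API in your version.

```csharp
using UnityEngine;
using UnityEngine.InputSystem;
using UnityEngine.XR.VisionOS.InputDevices; // assumed namespace for the pointer state struct

public class PinchInputExample : MonoBehaviour
{
    InputAction m_PointerAction;

    void OnEnable()
    {
        // Assumed binding path; the package samples show the exact control names.
        m_PointerAction = new InputAction(binding: "<VisionOSSpatialPointerDevice>/primarySpatialPointer");
        m_PointerAction.performed += OnPointer;
        m_PointerAction.Enable();
    }

    void OnDisable()
    {
        m_PointerAction.performed -= OnPointer;
        m_PointerAction.Disable();
        m_PointerAction.Dispose();
    }

    void OnPointer(InputAction.CallbackContext context)
    {
        var state = context.ReadValue<VisionOSSpatialPointerState>();

        // The gaze ray (startRayOrigin/startRayDirection) is only valid on the
        // first event of a pinch; device position/rotation track the point
        // where thumb and index finger meet while the gesture is active.
        Debug.Log($"Pinch at {state.inputDevicePosition}, gaze origin {state.startRayOrigin}");
    }
}
```

Attach this to any GameObject in the scene; in the simulator, clicking should drive the same code path as gaze/pinch on device.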

If we’re talking about UGUI/Canvas UI buttons, no. As I said earlier, as long as the buttons are visible to the main camera, they should “just work” in mixed reality, and they might work in Virtual Reality if you use an XRRayInteractor with a XRUIInputModule. We’ll have more sample content covering this, but again I’ll direct you to this thread about canvas UI for Mixed Reality.


Thanks for the info. The VR sample was very helpful. We have input working in our app in the simulator now, including world-space canvas controls. Only getting device tracking info during a pinch is going to require rethinking some of our UX, though. Is this still how things are expected to work on the shipping device?

Yes. The constraint around only getting eye gaze data on the first pinch event is an intentional design decision made by Apple to protect user privacy. I am not aware of any plans for this to change. Please share this feedback with Apple.

Shared this feedback. Was able to test on device, and things worked well.

Wanted to add our thanks for the samples; they have helped out a lot.

I understand the privacy restrictions around gaze, but the spatial pointer position/rotation also only update while a pinch is active. Since we can get more-or-less the same information through the XRHands package without that constraint, it’s less clear why these can’t provide continuous tracking. Is this also a limitation on Apple’s side that we should give them feedback on?

One major difference is that ARKit hand tracking data is only available to the app when it is in the foreground with an immersive space open. This is always the case for a Unity VR build, but the gaze/pinch interaction is available in the mixed reality shared space, as well.

Furthermore, the gaze/pinch interaction is usually bound to an interaction with some object. In mixed reality applications, if you pinch your fingers without any objects around, or if you aren’t gazing at an entity with a collider, no events are sent, and you can’t access pinch position/rotation. VR apps are kind of an exception, since you aren’t expected to use RealityKit, so there are no colliders for the gaze ray to hit. You can learn more about this in Apple’s documentation for SpatialEventGesture.

Of course you’re always welcome to share your feedback with Apple, but, in my opinion, this is a reasonable way for them to have designed these APIs. One is for interacting with UI and RealityKit entities, and one is for expressing data about the user. Something I wish they had was a handedness property on the spatial event, which would help you connect a gesture to the tracked hand skeleton. Alternatively, ARKit could expose a “pinch strength” and “aim pose” along with the skeletal data, similar to what is provided by OpenXR. At the moment, ARKit hands and SpatialEventGesture are completely separate systems that can’t be easily tied together.

Yeah, handedness data would help out. We’re a fully immersive VR app, currently implementing hand-based ray interaction to work around the hover limitations, and we’re trying to figure out the best way to approach that (while still supporting gaze for simulator testing).

Appreciate the response!