XR Hands: return more than a simple bool from XRHandShape's CheckCondition method

Hello,

I am currently working on implementing a way to use desired hand gestures as an “input source” for XRInteractors through the newly introduced Input Readers architecture of the XR Interaction Toolkit.
When implementing my custom InputReader, I took a look at the code of XRHandShape’s CheckCondition method, which only returns a bool that is true if the tracked hand matches that shape. I noticed that the actual condition check is performed by XRFingerShapeCondition’s method of the same name, which actually accesses and compares the normalized float value of the current finger shape.

That being said, I was thinking it could be useful to expose the information about those current normalized floats: for example, XRFingerShapeCondition’s CheckCondition could return (possibly via an out parameter, in order to keep the bool return type) a value between 0 and 1 indicating how close the finger is to the desired shape; then XRHandShape’s CheckCondition could, in the same way, compute and return the average of those values, indicating how close the whole hand is to the desired XRHandShape.

In this way we would have an easy way not just to know whether the gesture is performed or not, but also how close the hand is to the gesture, which could be helpful in a variety of situations (it could be used for gradual animations, or as an Input Reader value in my case, instead of simply being 0 or 1).
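To make the Input Reader part concrete, here is the kind of custom reader I have in mind, as a very rough sketch: GetHandShapeCloseness() is just a placeholder for whatever API ends up exposing the 0 to 1 value, and I’m assuming XRI’s IXRInputValueReader&lt;float&gt; interface with ReadValue/TryReadValue.

```csharp
using UnityEngine;
using UnityEngine.XR.Interaction.Toolkit.Inputs.Readers;

// Rough sketch only: GetHandShapeCloseness() is a stand-in for whatever API would
// expose the normalized 0-1 "how close is the hand to the shape" value.
public class HandShapeClosenessReader : MonoBehaviour, IXRInputValueReader<float>
{
    float m_Closeness;

    void Update()
    {
        // Hypothetical: refresh the cached value from the hand shape check each frame.
        m_Closeness = GetHandShapeCloseness();
    }

    public float ReadValue() => m_Closeness;

    public bool TryReadValue(out float value)
    {
        value = m_Closeness;
        return true;
    }

    float GetHandShapeCloseness()
    {
        // Placeholder: this is where the proposed out parameter / return value
        // from XRHandShape's condition check would be consumed.
        return 0f;
    }
}
```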

Hey @HunterProduction. You have stirred up a lot of internal discussion about this. We’ve previously discussed something similar, but I think understanding your use-cases will help us figure out what direction we want to go in if we expose additional APIs for this data.

We’re trying to figure out whether the 0 to 1 value would start at 0 once the threshold for the shape’s lower/upper tolerance range has been passed and then continue to 1 as it approaches the actual target value, or whether it would be normalized 0 to 1 based on the full curl of the finger relative to the target value.
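In other words, roughly the difference between these two (purely hypothetical) mappings:

```csharp
using UnityEngine;

// Illustrative only: the two interpretations we're weighing.
static class ClosenessInterpretations
{
    // A: 0 at the edge of the tolerance range, ramping to 1 at the target value itself.
    public static float WithinTolerance(float current, float target, float lowerTolerance)
        => Mathf.InverseLerp(target - lowerTolerance, target, current);

    // B: normalized over the finger's full 0-1 shape range up to the target value.
    public static float FullRange(float current, float target)
        => Mathf.InverseLerp(0f, target, current);
}
```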

Could you share a couple of use-cases where you think this would be useful? I can see pinching and gripping being fairly straightforward (these are also exposed by most OpenXR runtimes these days).


Thank you @VRDave_Unity! Sorry for getting back to you after so long, but I’m glad my suggestion was useful to your team.

From my own (humble) perspective, given how you designed the hand shape parameters, I think a proper way to give us access to more than a bool value from the CheckCondition method is to provide a normalized value between 0 and 1 based on how far we are from the requested condition.
I’ll try with an example. Starting from XRFingerShapeCondition’s CheckCondition, I’d assume that all values satisfying the condition:
desired - lowerTolerance <= value <= desired + upperTolerance
should be considered as satisfying the condition. That being said, for a single finger the result of the CheckCondition method could be an interpolated value which equals 1 when the checked value falls inside that range and falls off to 0 towards the opposite extremes (a trapezoid-shaped piecewise linear function).
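As a sketch (the class and method names are mine, and I’m assuming both the measured finger shape value and the desired value are normalized to the 0 to 1 range):

```csharp
using UnityEngine;

// Sketch of the "trapezoid" closeness for a single finger shape condition.
// Assumes both the measured value and the desired value are normalized to [0, 1].
public static class FingerShapeCloseness
{
    public static float Evaluate(float value, float desired, float lowerTolerance, float upperTolerance)
    {
        var min = desired - lowerTolerance;
        var max = desired + upperTolerance;

        // Inside the tolerance range: the condition is fully satisfied.
        if (value >= min && value <= max)
            return 1f;

        // Below the range: ramp linearly from 0 (at value == 0) up to 1 (at value == min).
        if (value < min)
            return min <= 0f ? 1f : Mathf.Clamp01(value / min);

        // Above the range: ramp linearly from 1 (at value == max) down to 0 (at value == 1).
        return max >= 1f ? 1f : Mathf.Clamp01((1f - value) / (1f - max));
    }
}
```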

After this, XRHandShape’s CheckCondition method could provide the average of the normalized values obtained for each finger. In this way, we would have an approximate linear measure of “how far we are” from the desired pose.
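For the whole hand it could then look like this (again just a sketch, reusing the per-finger function above):

```csharp
// Sketch: whole-hand closeness as the average of the per-finger closeness values,
// reusing the FingerShapeCloseness.Evaluate sketch from above.
public static class HandShapeCloseness
{
    public static float Evaluate(
        float[] values, float[] desired, float[] lowerTolerances, float[] upperTolerances)
    {
        if (values.Length == 0)
            return 0f;

        var sum = 0f;
        for (var i = 0; i < values.Length; i++)
            sum += FingerShapeCloseness.Evaluate(
                values[i], desired[i], lowerTolerances[i], upperTolerances[i]);

        return sum / values.Length;
    }
}
```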
As a practical usage, think about a simple visual hint for a hand gesture. We could for example interpolate the hand mesh color between two colors (say, red to green) to give the user a visual representation of how close they are to reaching the target hand shape for the gesture.
As a more specific application, we could trigger a simple “procedural animation” in the scene by “chaining” two gestures performed by the user. For example, let’s say we want a cool interactive way to let the user “recompose” a fragmented 3D model (a 3D object divided into pieces, or whatever). A first gesture sets the starting point (for example, “place your hand palm open in front of you!”). Then, the progress towards a second target gesture (for example, “now gradually close your fist!”) drives a particular animation in the scene (rejoining the separated pieces of the model).
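A rough usage sketch of both ideas (everything here is hypothetical; the closeness value would come from whatever API ends up exposing it, and OnClosenessUpdated is just a made-up entry point):

```csharp
using UnityEngine;

// Rough usage sketch: drive a visual hint and an animation from the 0-1 closeness value.
public class GestureClosenessFeedback : MonoBehaviour
{
    [SerializeField] Renderer m_HandRenderer;
    [SerializeField] Animator m_Animator;       // e.g. the "recompose the model" animation
    [SerializeField] string m_StateName = "Recompose";

    // Hypothetical callback, fed by whatever exposes the normalized closeness.
    public void OnClosenessUpdated(float closeness)
    {
        // Visual hint: red when far from the gesture, green when matching it.
        m_HandRenderer.material.color = Color.Lerp(Color.red, Color.green, closeness);

        // Procedural animation: scrub the animation's normalized time with the closeness.
        m_Animator.Play(m_StateName, 0, Mathf.Clamp01(closeness));
        m_Animator.speed = 0f; // keep it paused so the gesture fully controls the progress
    }
}
```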

On the other hand, I personally don’t think a normalized value that only starts ramping once the tolerance threshold has been passed would be as useful. We could always make the hand shape tolerances narrower if we want a more precise detection of the full match.

I hope these explanations are helpful! Let me know your thoughts.