Hand Tracking

How Hand Tracking Works in WebXR

In WebXR, a hand is represented by an XRHand object. XRHand is an ordered map whose keys are the hand joint names and whose values are XRJointSpace objects.

There are 25 entries in this ordered map:

Figure 1: Hand joint indexes

Index  Joint name
0      wrist
1      thumb-metacarpal
2      thumb-phalanx-proximal
3      thumb-phalanx-distal
4      thumb-tip
5      index-finger-metacarpal
6      index-finger-phalanx-proximal
7      index-finger-phalanx-intermediate
8      index-finger-phalanx-distal
9      index-finger-tip
10     middle-finger-metacarpal
11     middle-finger-phalanx-proximal
12     middle-finger-phalanx-intermediate
13     middle-finger-phalanx-distal
14     middle-finger-tip
15     ring-finger-metacarpal
16     ring-finger-phalanx-proximal
17     ring-finger-phalanx-intermediate
18     ring-finger-phalanx-distal
19     ring-finger-tip
20     pinky-finger-metacarpal
21     pinky-finger-phalanx-proximal
22     pinky-finger-phalanx-intermediate
23     pinky-finger-phalanx-distal
24     pinky-finger-tip
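
The joint list above can also be written down as a typed constant, which makes joint names checkable at compile time. This is only a sketch: HAND_JOINTS, HandJointName and jointIndex are hypothetical names introduced for illustration and are not part of WebXR or of this project.

```typescript
// Joint names in the same order as the list above (array position = index).
const HAND_JOINTS = [
  "wrist",
  "thumb-metacarpal",
  "thumb-phalanx-proximal",
  "thumb-phalanx-distal",
  "thumb-tip",
  "index-finger-metacarpal",
  "index-finger-phalanx-proximal",
  "index-finger-phalanx-intermediate",
  "index-finger-phalanx-distal",
  "index-finger-tip",
  "middle-finger-metacarpal",
  "middle-finger-phalanx-proximal",
  "middle-finger-phalanx-intermediate",
  "middle-finger-phalanx-distal",
  "middle-finger-tip",
  "ring-finger-metacarpal",
  "ring-finger-phalanx-proximal",
  "ring-finger-phalanx-intermediate",
  "ring-finger-phalanx-distal",
  "ring-finger-tip",
  "pinky-finger-metacarpal",
  "pinky-finger-phalanx-proximal",
  "pinky-finger-phalanx-intermediate",
  "pinky-finger-phalanx-distal",
  "pinky-finger-tip",
] as const;

// Union of all valid joint names, derived from the constant above.
type HandJointName = (typeof HAND_JOINTS)[number];

// Hypothetical helper: returns the index of a joint name in the list.
function jointIndex(name: HandJointName): number {
  return HAND_JOINTS.indexOf(name);
}
```

With this in place, a call such as jointIndex("index-finger-tip") is type-checked, while a misspelled joint name is rejected by the compiler.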

Get Hands

Hands are input sources. We can access them in two steps:

  1. First, we need to get the XRHandState. To get it we use the useXRInputSourceState hook.
const handSourceRight = useXRInputSourceState("hand", "right");
const handSourceLeft = useXRInputSourceState("hand", "left");
  2. Then we retrieve the XRHand from the XRHandInputSource, which is accessible through the .inputSource property of the XRHandState. The state can be undefined while a hand is not tracked, hence the optional chaining.
const right = handSourceRight?.inputSource.hand;
const left = handSourceLeft?.inputSource.hand;

Get Fingers

Now we have our hands, but we also want to access our fingers, and especially their positions.

Let’s get the thumb tip and the index finger tip. To do this we need three things: the XRHand, the current XRFrame and an XRReferenceSpace.

The XRFrame instance should expose a getJointPose method. It takes the space of our joint (XRJointSpace) and a reference space, and returns the pose of the joint, which contains its position and orientation.

Be careful: getJointPose can be undefined, so check that it exists before calling it.

const thumbTip = hand.get("thumb-tip");
const indexTip = hand.get("index-finger-tip");

const thumbPose = frame.getJointPose(thumbTip, referenceSpace);
const indexPose = frame.getJointPose(indexTip, referenceSpace);

Then, on each finger pose, the .transform property gives us an XRRigidTransform, which lets us access .position. But .position does not return a Vector3 (which is specific to Three.js). It returns a DOMPointReadOnly, which needs to be converted to a Vector3.

const thumbPos = DOMPointReadOnlyToVector3(thumbPose.transform.position);
const indexPos = DOMPointReadOnlyToVector3(indexPose.transform.position);

Function to convert DOMPointReadOnly to Vector3

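import * as THREE from "three";

// Copies the x, y and z components into a new Three.js vector (w is ignored).
function DOMPointReadOnlyToVector3(entry: DOMPointReadOnly) {
  return new THREE.Vector3(entry.x, entry.y, entry.z);
}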
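
The steps of this section can be combined into one guarded lookup. The sketch below is not the project's code: jointPosition is a hypothetical helper, and the small interfaces are stand-ins that model only the members of XRHand and XRFrame used here, so the example runs without a headset or Three.js. It returns a plain { x, y, z } object instead of a Vector3.

```typescript
// Minimal stand-ins for the WebXR types (assumption: only the members
// needed by this sketch are modelled).
interface Point3 { x: number; y: number; z: number }
interface JointPoseLike { transform: { position: Point3 } }
interface FrameLike<S, R> {
  getJointPose?: (joint: S, baseSpace: R) => JointPoseLike | null | undefined;
}
interface HandLike<S> { get(name: string): S | undefined }

// Hypothetical helper: position of one joint, or undefined when the hand,
// the frame, getJointPose, the joint or its pose is missing.
function jointPosition<S, R>(
  hand: HandLike<S> | undefined,
  frame: FrameLike<S, R> | undefined,
  referenceSpace: R | undefined,
  jointName: string
): Point3 | undefined {
  if (!hand || !frame || !frame.getJointPose || referenceSpace === undefined) {
    return undefined;
  }
  const joint = hand.get(jointName);
  if (joint === undefined) return undefined;

  const pose = frame.getJointPose(joint, referenceSpace);
  if (!pose) return undefined;

  // Copy the components so the caller does not hold on to the pose object.
  const { x, y, z } = pose.transform.position;
  return { x, y, z };
}
```

Returning undefined for every missing piece keeps the calling code to a single check per joint.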

Detect a pinch

Now that we know all of this, we can write a function to detect a pinch:

It needs the XRHand we want to test for the pinch, the XRFrame and the XRReferenceSpace.

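/**
 * Detects whether the hand is making a pinch.
 * @param hand - XRHand to test
 * @param frame - current XRFrame
 * @param referenceSpace - XRReferenceSpace in which the joint poses are expressed
 * @param threshold - distance (in meters) below which a pinch is detected. Default = 0.025 (2.5 cm)
 * @returns true if the thumb tip and the index finger tip are closer than the threshold
 */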
export function isPinching(
  hand: XRHand | undefined,
  frame: XRFrame | undefined,
  referenceSpace: XRReferenceSpace | undefined,
  threshold: number = 0.025
): boolean {
  if (!(hand && frame && frame.getJointPose && referenceSpace)) return false;

  const thumbTip = hand.get("thumb-tip");
  const indexTip = hand.get("index-finger-tip");

  if (!thumbTip || !indexTip) {
    return false;
  }

  const thumbPose = frame.getJointPose(thumbTip, referenceSpace);
  const indexPose = frame.getJointPose(indexTip, referenceSpace);

  if (!thumbPose || !indexPose) {
    return false;
  }

  const thumbPos = DOMPointReadOnlyToVector3(thumbPose.transform.position);
  const indexPos = DOMPointReadOnlyToVector3(indexPose.transform.position);

  const distance = thumbPos.distanceTo(indexPos);
  return distance < threshold;
}

Hand Detection Module

HandState is the module in which we implemented hand tracking. It includes functions to detect hand actions and an event dispatcher.

Gesture Detection Functions

isPinching(hand, frame, referenceSpace, threshold?): boolean

Detects if the hand is performing a pinch gesture between thumb and index finger.

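Parameters: hand (XRHand), frame (XRFrame), referenceSpace (XRReferenceSpace) and an optional threshold in meters (default 0.025).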

Logic: Calculates the distance between thumb tip and index finger tip. If below threshold, considers it a pinch.

isPinchingMiddle(hand, frame, referenceSpace, threshold?): boolean

Detects pinching between thumb and middle finger.

Parameters: Same as isPinching

Usage: Useful for gesture differentiation or alternative commands.
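
Since both pinch functions compare the thumb tip with one fingertip, the check can be sketched generically. The code below is an illustration, not the module's implementation: isPinchingWith, JointLookup and FingerTip are hypothetical names, joint positions come from a lookup function rather than from an XRFrame, and the default threshold is assumed to be the same 0.025 m as in isPinching.

```typescript
interface Point3 { x: number; y: number; z: number }

// Fingertip joint names taken from the XRHand joint list.
type FingerTip =
  | "index-finger-tip"
  | "middle-finger-tip"
  | "ring-finger-tip"
  | "pinky-finger-tip";

// Hypothetical lookup: joint position in meters, or undefined if untracked.
type JointLookup = (joint: string) => Point3 | undefined;

// Euclidean distance between two points.
function distance(a: Point3, b: Point3): number {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

// Hypothetical generalisation: pinch between the thumb tip and any fingertip.
function isPinchingWith(
  lookup: JointLookup,
  finger: FingerTip,
  threshold: number = 0.025 // assumed default, copied from isPinching
): boolean {
  const thumb = lookup("thumb-tip");
  const tip = lookup(finger);
  if (!thumb || !tip) return false; // untracked joints never count as a pinch
  return distance(thumb, tip) < threshold;
}
```

Under these assumptions, isPinchingWith(lookup, "middle-finger-tip") plays the role of isPinchingMiddle, and the same hand can report an index pinch and a middle pinch independently.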

isOpenHand(hand, frame, referenceSpace, threshold?): boolean

Detects if the hand is open by measuring average distance between palm and fingertips.

Parameters:

Logic:

isCloseHand(hand, frame, referenceSpace, threshold?): boolean

Detects if the hand is closed. Unlike !isOpenHand(), this function checks that the hand is defined to prevent false positives.
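
One possible reading of these two functions is sketched below. Several points are assumptions and not taken from the module: the wrist joint stands in for the palm (the joint list has no palm entry), the 0.14 m threshold is purely illustrative, and isOpenHandSketch, isCloseHandSketch and JointLookup are hypothetical names. Joint positions come from a lookup function so the example runs without an XRFrame.

```typescript
interface Point3 { x: number; y: number; z: number }

// Hypothetical lookup: joint position in meters, or undefined if untracked.
type JointLookup = (joint: string) => Point3 | undefined;

const FINGERTIPS = [
  "thumb-tip",
  "index-finger-tip",
  "middle-finger-tip",
  "ring-finger-tip",
  "pinky-finger-tip",
];

// Average wrist-to-fingertip distance, or undefined if any joint is missing.
function averageTipDistance(lookup: JointLookup): number | undefined {
  const wrist = lookup("wrist"); // assumption: wrist used in place of the palm
  if (!wrist) return undefined;
  let sum = 0;
  for (const name of FINGERTIPS) {
    const tip = lookup(name);
    if (!tip) return undefined;
    sum += Math.hypot(tip.x - wrist.x, tip.y - wrist.y, tip.z - wrist.z);
  }
  return sum / FINGERTIPS.length;
}

// Open: hand data is available and the fingertips are far from the wrist.
function isOpenHandSketch(lookup: JointLookup, threshold: number = 0.14): boolean {
  const average = averageTipDistance(lookup);
  return average !== undefined && average > threshold;
}

// Closed: hand data is available and the fingertips are near the wrist.
function isCloseHandSketch(lookup: JointLookup, threshold: number = 0.14): boolean {
  const average = averageTipDistance(lookup);
  return average !== undefined && average <= threshold;
}
```

When no hand data is available, both functions return false, whereas !isOpenHandSketch(lookup) would be true. That is the false positive the dedicated closed-hand check avoids.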

Main Class HandState

Types and Interfaces

type HandActionEvents = "pinch" | "pinch-middle" | "opened" | "closed";

interface HandActionEvent {
  hand: XRHand;
  side: "left" | "right";
}

Update Method

Role: Main method to call each frame to detect gestures and emit events.

Detection Logic:

  1. For each hand (right and left) and for each gesture type (pinch, pinch-middle, opened, closed)
  2. Check current gesture state
  3. Compare with previous state
  4. If transition from false to true, emit event
  5. Update previous state
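
The transition check in steps 2 to 5 amounts to rising-edge detection. Here is a minimal sketch of that idea, assuming the previous states are kept in a map keyed by hand side and gesture. EdgeDetector and its rising method are hypothetical names, and how HandState actually stores its previous state is not shown in this document.

```typescript
type HandActionEvents = "pinch" | "pinch-middle" | "opened" | "closed";
type Side = "left" | "right";

// Hypothetical helper that remembers the last state of each side/gesture pair.
class EdgeDetector {
  private previous = new Map<string, boolean>();

  // Returns true only when the gesture goes from false to true,
  // then stores the current state for the next frame.
  rising(side: Side, gesture: HandActionEvents, current: boolean): boolean {
    const key = `${side}:${gesture}`;
    const was = this.previous.get(key) ?? false; // unseen pairs start as false
    this.previous.set(key, current);
    return current && !was;
  }
}
```

An update method built this way would call rising once per hand and gesture each frame and emit the event only when it returns true, so a pinch held over many frames fires a single event.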

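Events Emitted: "pinch", "pinch-middle", "opened" and "closed" (the HandActionEvents values), each carrying a HandActionEvent with the hand and its side.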

Usage Example

// Initialization
// right and left are the XRHand objects retrieved in "Get Hands"
const handState = new HandState({
  rightHand: right,
  leftHand: left,
  pinchThreshold: 0.03, // 3 cm
});

// Event listening
handState.addEventListener("pinch", (event) => {
  console.log(`Pinch detected on ${event.side} hand`);
});

handState.addEventListener("opened", (event) => {
  console.log(`${event.side} hand opened`);
});

// In render loop
useFrame((_, __, frame) => {
  handState?.update(frame, referenceSpace);
});

All the source code is available here