We explore the use of multimodal input to predict the landing position of a ray pointer while selecting targets in a virtual reality (VR) environment. We first extend a prior 2D Kinematic Template Matching technique to include head movements. This new technique, Head-Coupled Kinematic Template Matching, was found to improve upon the existing 2D approach, achieving an angular error of 10.0° when a user was 40% of the way through their movement. We then investigate two additional models that incorporate eye gaze, both of which further improved the predicted landing positions. At the same 40% point in the movement, the first model, Gaze-Coupled Kinematic Template Matching, resulted in an angular error of 6.8° for reciprocal target layouts and 9.1° for random target layouts, while the second model, Hybrid Kinematic Template Matching, resulted in an angular error of 5.2° for reciprocal target layouts and 7.2° for random target layouts. We also found that using only the current gaze location yielded sufficiently accurate predictions in many conditions. We reflect on these results by discussing the broader implications of utilizing multimodal input to inform selection predictions in VR.
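The abstract does not describe the underlying algorithm, so the following is only a rough, illustrative sketch of how a Kinematic Template Matching style predictor might operate: the partial angular velocity profile of an in-progress movement is compared against pre-recorded template profiles with known total amplitudes, and the best match supplies the predicted landing angle, optionally blended with the current gaze direction as a stand-in for the hybrid model. The function name, parameters, and blending weight below are hypothetical and are not the authors' implementation.

```python
import numpy as np

def predict_landing_angle(partial_velocity, templates, gaze_angle=None, w_gaze=0.5):
    """Hypothetical KTM-style endpoint prediction (illustrative sketch only).

    partial_velocity : 1-D array of angular speed samples from the ongoing movement.
    templates        : list of (velocity_profile, total_angle) pairs recorded from
                       completed movements with known landing angles.
    gaze_angle       : optional current gaze direction, blended with the KTM estimate
                       as a rough analogue of a gaze-informed hybrid model.
    """
    n = len(partial_velocity)
    best_err, best_angle = np.inf, None
    for profile, total_angle in templates:
        if len(profile) < n:
            continue
        # Compare the observed partial profile against the template's first n samples.
        err = np.sqrt(np.mean((np.asarray(profile[:n]) - partial_velocity) ** 2))
        if err < best_err:
            best_err, best_angle = err, total_angle
    if best_angle is None:
        return gaze_angle  # no usable template; fall back to gaze if available
    if gaze_angle is not None:
        # Simple weighted blend of gaze direction and the matched template's angle.
        return w_gaze * gaze_angle + (1 - w_gaze) * best_angle
    return best_angle
```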