Currently, the biggest obstacle to applying eye tracking technology in Virtual Reality (VR) and Augmented Reality (AR) scenes is the difficulty of determining gaze distance and object distance in 3D scenes. Previous research has explored geometric calculation methods based on the vestibulo-ocular reflex (VOR) and on the binocular visual angle, but their performance has not reached a practical level. This paper proposes a new approach that estimates binocular gaze depth by analyzing time series eye movement data with deep learning, and introduces a Mix-Temporal Convolutional Network (TCN) time series network for this purpose. By combining VOR with deep learning theory, the paper achieves state-of-the-art performance in estimating gaze depth from eye movements.
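To make the geometric baseline concrete, the following is a minimal sketch of how depth can be recovered from the binocular visual angle. It is not the paper's method. It assumes a fixation point on the midline between the two eyes, and the function names and the 63 mm interpupillary distance are illustrative choices that do not come from the paper.

```python
import math


def vergence_angle_deg(ipd_m: float, depth_m: float) -> float:
    """Angle between the two lines of sight for a midline target at depth_m."""
    return math.degrees(2.0 * math.atan((ipd_m / 2.0) / depth_m))


def vergence_depth_m(ipd_m: float, vergence_deg: float) -> float:
    """Invert the relation above: depth of a midline target from the vergence angle."""
    if vergence_deg <= 0.0:
        # Parallel (or diverging) lines of sight never intersect in front of the viewer.
        return math.inf
    half_angle = math.radians(vergence_deg) / 2.0
    return (ipd_m / 2.0) / math.tan(half_angle)


if __name__ == "__main__":
    ipd = 0.063  # assumed interpupillary distance in metres (illustrative value)
    for true_depth in (0.5, 2.0, 5.0):
        angle = vergence_angle_deg(ipd, true_depth)
        # Perturb the measured angle by a hypothetical 0.2 degree tracking error.
        noisy_depth = vergence_depth_m(ipd, angle - 0.2)
        print(f"depth {true_depth:.1f} m -> angle {angle:.3f} deg, "
              f"estimate with -0.2 deg error: {noisy_depth:.2f} m")
```

Running the loop shows the vergence angle shrinking as the target moves away, so a fixed angular error produces a depth error that grows quickly with distance. This sensitivity is one plausible reason purely geometric estimates fall short at the distances common in VR and AR scenes.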