With the deep integration of psychology, artificial intelligence, and related technologies, eye-control technology has achieved notable results in practical applications. However, the accuracy of current single-modal eye-control technology remains low, mainly because the high randomness of eye movements during human–computer interaction leads to inaccurate eye-movement detection. This study therefore proposes an intent-recognition method that fuses facial-expression and eye-movement information, building on a multimodal intent-recognition dataset of facial expressions and eye movements constructed for this work. A self-attention fusion strategy computes the fused features, and a multi-layer perceptron classifies them; this realizes mutual attention between different features and improves intent-recognition accuracy by selectively enhancing the weights of effective features. To address inaccurate eye-movement detection, an improved YOLOv5 model is proposed, with detection accuracy raised by two additions: a small-target detection layer and a coordinate attention (CA) mechanism. A corresponding discrimination algorithm for each eye-movement action then converts detected movements into eye-behavior commands. Finally, experimental verification of the complete eye–computer interaction scheme, which combines the intent-recognition model with the eye-movement detection model, shows that an eye-controlled manipulator can perform various tasks with an accuracy above 95 percent.
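The self-attention fusion step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature dimensions (128-d per modality), head count, and intent-class count are assumed for the example, and the two modality embeddings are treated as a two-token sequence so that each modality can attend to the other before an MLP classifies the fused result.

```python
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    """Hypothetical self-attention fusion of facial-expression and
    eye-movement features, followed by MLP intent classification."""

    def __init__(self, feat_dim=128, num_heads=4, num_intents=5):
        super().__init__()
        # Self-attention over the two modality tokens lets each modality
        # attend to the other ("mutual attention between features").
        self.attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim * 2, 64),
            nn.ReLU(),
            nn.Linear(64, num_intents),
        )

    def forward(self, face_feat, eye_feat):
        # Stack the two modality vectors as a length-2 token sequence: (B, 2, D)
        tokens = torch.stack([face_feat, eye_feat], dim=1)
        fused, _ = self.attn(tokens, tokens, tokens)  # attention-weighted features
        flat = fused.flatten(start_dim=1)             # (B, 2*D)
        return self.mlp(flat)                         # intent logits

model = FusionClassifier()
face = torch.randn(8, 128)  # batch of facial-expression embeddings (assumed size)
eye = torch.randn(8, 128)   # batch of eye-movement embeddings (assumed size)
logits = model(face, eye)   # shape (8, 5): one score per assumed intent class
```

In this sketch the attention output is flattened and fed to the perceptron; in practice the fused features could also be pooled, and the embedding extractors for each modality are separate upstream networks not shown here.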
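The abstract does not specify the per-action discrimination algorithms, but a common baseline for separating fixations from saccades is a dispersion-threshold (I-DT style) rule, sketched below under assumed thresholds (pixel dispersion limit and minimum window length are illustrative, not from the paper).

```python
def classify_fixations(points, max_dispersion=30.0, min_samples=5):
    """Label each gaze sample (x, y) as fixation (True) or saccade (False)
    using a dispersion-threshold rule: a window of at least `min_samples`
    points whose x-range + y-range stays within `max_dispersion` is a fixation."""

    def dispersion(window):
        xs = [p[0] for p in window]
        ys = [p[1] for p in window]
        return (max(xs) - min(xs)) + (max(ys) - min(ys))

    labels = [False] * len(points)
    i = 0
    while i < len(points):
        j = i + min_samples
        if j > len(points):
            break  # too few samples left to form a fixation window
        if dispersion(points[i:j]) <= max_dispersion:
            # Grow the window while the gaze stays within the dispersion limit.
            while j < len(points) and dispersion(points[i:j + 1]) <= max_dispersion:
                j += 1
            for k in range(i, j):
                labels[k] = True
            i = j
        else:
            i += 1  # slide past samples that belong to a saccade
    return labels

steady = [(100.0, 100.0)] * 6 + [(400.0, 400.0)] * 6
labels = classify_fixations(steady)   # two stable clusters: all fixation samples
jumpy = [(0.0, 0.0), (500.0, 500.0)] * 5
labels2 = classify_fixations(jumpy)   # rapid alternation: no fixation window
```

A full scheme would map the resulting fixation/saccade segments (plus blinks and dwell times) to the behavior commands mentioned in the abstract.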