Abstract
With the development of technologies such as computer vision and speech recognition, human-computer interaction (HCI) technology has gradually matured and shows broad application prospects in many fields. This paper constructs a multimodal HCI system based on a Bayesian classification algorithm. The system collects the interactor's body posture and speech during the interaction and determines the interaction intention from three modalities: finger pointing direction, face orientation, and speech; the interactor does not need to touch or wear any equipment throughout the process. A head pose estimation method is introduced and serves as the face orientation modality of the multimodal interaction system. Meanwhile, the DBSCAN (density-based spatial clustering of applications with noise) algorithm is used to select interaction frames, filtering out unstable frames and improving system robustness. In addition, this paper proposes a multimodal data fusion algorithm based on Bayes' theorem to improve interaction accuracy. Experiments show that the proposed multimodal HCI system achieves better robustness and higher accuracy than single-modal interaction and decision-matrix fusion.
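To make the Bayes'-theorem fusion idea concrete, the sketch below shows one common way such a fusion can be realized; it is not the authors' implementation. It assumes each modality (finger direction, face orientation, speech) yields a likelihood over the same set of candidate interaction targets, that the modalities are conditionally independent given the target (a naive-Bayes factorization), and a uniform prior; the function name `bayes_fuse` and the numeric values are hypothetical.

```python
import numpy as np

def bayes_fuse(likelihoods, prior=None):
    """Fuse per-modality likelihoods P(obs_m | target) into a posterior.

    likelihoods: array of shape (n_modalities, n_targets); each row is the
                 likelihood one modality assigns to every candidate target.
    prior:       P(target); defaults to uniform (an assumption here).
    Assumes conditional independence of the modalities given the target.
    """
    likelihoods = np.asarray(likelihoods, dtype=float)
    n_targets = likelihoods.shape[1]
    if prior is None:
        prior = np.full(n_targets, 1.0 / n_targets)
    # Bayes' theorem: P(target | all obs) ∝ P(target) * Π_m P(obs_m | target)
    joint = prior * likelihoods.prod(axis=0)
    return joint / joint.sum()

# Hypothetical likelihoods over 3 candidate targets, one row per modality:
# finger pointing direction, face orientation, and speech recognition.
finger = [0.6, 0.3, 0.1]
face   = [0.5, 0.4, 0.1]
speech = [0.2, 0.7, 0.1]
posterior = bayes_fuse([finger, face, speech])
print(posterior, "-> inferred target:", int(posterior.argmax()))
```

In this toy example the speech modality disagrees with the two visual modalities, and the fused posterior resolves the conflict by weighing all three, which is the kind of ambiguity a single-modal system cannot arbitrate.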
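The abstract also mentions using DBSCAN to select stable interaction frames. The following sketch illustrates that filtering step under stated assumptions: each frame is reduced to a small feature vector (here a 2-D stand-in for, e.g., pointing-direction or head-pose angles), the `eps` and `min_samples` values are illustrative, and the data are synthetic. Frames DBSCAN labels as noise (`-1`) are treated as unstable and dropped.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
# Synthetic data: 40 frames tightly clustered around one stable pose,
# plus 10 scattered, jittery frames standing in for unstable ones.
stable = rng.normal(loc=[0.8, 0.1], scale=0.02, size=(40, 2))
jitter = rng.uniform(-1.0, 1.0, size=(10, 2))
frames = np.vstack([stable, jitter])

# Density-based clustering: dense runs of similar frames form clusters,
# sparse outliers get the noise label -1.
labels = DBSCAN(eps=0.1, min_samples=5).fit_predict(frames)
kept = frames[labels != -1]  # keep only frames inside a dense cluster
print(f"kept {len(kept)} of {len(frames)} frames")
```

Because DBSCAN needs no preset cluster count and explicitly marks outliers, it is a natural fit for discarding transient frames before the fusion step.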