Abstract

The operating skills of vascular interventionists have an important impact on surgical outcomes. However, research on behavior recognition and skills learning for interventionists' operating skills remains limited. In this study, an innovative deep learning-based multimodal information fusion architecture is proposed for recognizing and analyzing eight common operating behaviors of interventionists. An experimental platform integrating four modal sensors is used to collect multimodal data from interventionists. ANOVA and Mann-Whitney U tests are used for relevance analysis of the data. The analysis shows that, for most pairs of actions, the unimodal data exhibit no significant difference at the p < 0.001 level, so unimodal data alone cannot support accurate behavior recognition. Therefore, fusion approaches based on existing machine learning classifiers are compared against the proposed deep learning fusion architecture. The results indicate that the proposed deep learning-based fusion architecture achieves an overall accuracy of 98.5%, surpassing both the machine learning classifiers (93.51%) and the unimodal data (90.05%). The deep learning-based multimodal information fusion architecture demonstrates the feasibility of behavior recognition and skills learning for interventionists' operating skills. Furthermore, applying deep learning-based multimodal fusion to surgeons' operating skills will help improve the autonomy and intelligence of surgical robotic systems.
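
To make the relevance analysis concrete, the following is a minimal sketch of how per-modality features could be screened with a one-way ANOVA and pairwise Mann-Whitney U tests using scipy. The behavior names, feature arrays, and significance threshold applied here are illustrative assumptions, not the study's actual data or protocol.

```python
# Minimal sketch of the statistical screening step, assuming each behavior
# class yields a 1-D array of feature values for one sensor modality
# (hypothetical synthetic data).
import numpy as np
from scipy.stats import f_oneway, mannwhitneyu

rng = np.random.default_rng(0)
# Hypothetical: feature samples for three of the eight behaviors, one modality.
groups = {
    "push": rng.normal(0.0, 1.0, 100),
    "pull": rng.normal(0.1, 1.0, 100),
    "rotate": rng.normal(0.05, 1.0, 100),
}

# One-way ANOVA across all behavior classes for this modality.
f_stat, p_anova = f_oneway(*groups.values())
print(f"ANOVA: F={f_stat:.3f}, p={p_anova:.4f}")

# Pairwise Mann-Whitney U tests between behavior classes.
names = list(groups)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        _, p = mannwhitneyu(groups[names[i]], groups[names[j]],
                            alternative="two-sided")
        sig = "significant" if p < 0.001 else "not significant at p < 0.001"
        print(f"{names[i]} vs {names[j]}: p={p:.4f} ({sig})")
```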
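The abstract does not specify the fusion network's internals, so the sketch below is only one plausible reading: four modality-specific encoders whose embeddings are concatenated and passed to a shared classifier over the eight behavior classes, written in PyTorch. The input dimensions, hidden sizes, and the choice of fusion by concatenation are all assumptions and may differ from the paper's actual architecture.

```python
# Plausible sketch of a late-fusion multimodal classifier (PyTorch).
# Layer sizes, input dimensions, and concatenation fusion are assumptions.
import torch
import torch.nn as nn

class ModalityEncoder(nn.Module):
    """Maps one sensor modality's feature vector to a shared embedding."""
    def __init__(self, in_dim: int, emb_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, emb_dim), nn.ReLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

class FusionClassifier(nn.Module):
    """Encodes each of four modalities, concatenates the embeddings,
    and classifies into the eight operating behaviors."""
    def __init__(self, in_dims: list[int], num_classes: int = 8):
        super().__init__()
        self.encoders = nn.ModuleList(ModalityEncoder(d) for d in in_dims)
        self.head = nn.Sequential(
            nn.Linear(64 * len(in_dims), 128), nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, xs: list[torch.Tensor]) -> torch.Tensor:
        fused = torch.cat([enc(x) for enc, x in zip(self.encoders, xs)],
                          dim=-1)
        return self.head(fused)  # logits over the eight behaviors

# Hypothetical usage: four modalities with assumed feature dimensions.
model = FusionClassifier(in_dims=[32, 16, 24, 8])
batch = [torch.randn(4, d) for d in [32, 16, 24, 8]]
logits = model(batch)  # shape: (4, 8)
```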
