Abstract

Completing continuous circular capsulorhexis (CCC) requires the operator to perform fine maneuvers, and these continuous fine actions are difficult to classify accurately because the action classes in CCC procedures are imbalanced. Multimodal deep learning can improve classifier performance, but the recognition accuracy of minority classes remains difficult to improve. To address these problems, a bidirectional gated recurrent unit (Bi-GRU)-attention-based multimodal, multi-timescale data fusion network (BiMNet) is proposed. It contains a data extraction module called the skip-concatenate gated recurrent unit (SC-GRU), a bimodal data fusion attention computation, and a decoder module. Together, these modules extract features at different temporal scales from multimodal action data and fuse them effectively. The model is validated on the ophthalmologist CCC multimodal maneuver dataset, which was collected with the data collection platform constructed in this research. It achieves an accuracy of 0.9124 ± 0.0125 in continuous action sequence segmentation and raises the F1-score of minority class recognition to over 80%, outperforming the baseline algorithms.
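
To illustrate why the abstract reports a minority-class F1-score alongside overall accuracy, the following minimal sketch computes both on a toy frame-wise labeling. It is not the authors' evaluation code: the function names, the label names "tear" and "regrasp", and the toy sequences are all hypothetical, chosen only to show how high accuracy can coexist with a weak minority-class F1 under class imbalance.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of frames whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_f1(y_true, y_pred):
    """Return {label: F1} computed frame-wise for every label that appears."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the frame was not p
            fn[t] += 1  # true class t was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Hypothetical toy sequence: 18 majority-class frames, 2 minority-class frames.
y_true = ["tear"] * 18 + ["regrasp"] * 2
# The classifier mislabels one of the two minority frames as the majority class.
y_pred = ["tear"] * 18 + ["tear", "regrasp"]

acc = accuracy(y_true, y_pred)
f1 = per_class_f1(y_true, y_pred)
print(f"accuracy = {acc:.3f}")
for label, score in f1.items():
    print(f"F1[{label}] = {score:.3f}")
```

In this toy case a single misclassified frame leaves accuracy at 0.95 while the minority-class F1 drops to about 0.667, which is why per-class F1 is the more informative measure for rare actions.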

