Abstract

In future manned lunar exploration missions, astronauts will work alongside lunar robots, which places high demands on human–robot interaction (HRI). Because the accuracy of gesture-recognition interaction alone does not meet the requirements of joint human–robot exploration missions, we propose the DMS-SK/BLSTM-CTC hybrid network to improve HRI performance. For gesture recognition, since VGG-SK has limited accuracy and a complex architecture, we delete its fourth convolution module, optimize the final global pooling layer, and introduce a dilated convolution block and a multiscale convolution block, obtaining the DMS-SK-based gesture-recognition sub-network. Compared with traditional recognition methods, DMS-SK improves both accuracy and overall performance. For speech recognition, because the bidirectional long short-term memory (BLSTM) unit is well suited to processing temporal information and the Connectionist Temporal Classification (CTC) algorithm simplifies speech-data preprocessing, we use a CTC-based BLSTM as the speech-recognition sub-network. Finally, we combine DMS-SK with BLSTM-CTC to form the DMS-SK/BLSTM-CTC gesture/speech hybrid network. In addition, we construct a gesture/speech hybrid dataset from 10 gestures in the American Sign Language (ASL) dataset and 10 speech commands. Experimental results show that, compared with the pure gesture and pure speech networks, the recognition accuracy of the gesture/speech hybrid network improves by 2% and 12%, respectively, reaching 97.38%, which fulfills astronauts' requirements for HRI.
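The dilated convolution block mentioned above enlarges a filter's receptive field without adding parameters by sampling the input at a stride (the dilation rate) inside the kernel. The following minimal 1-D NumPy sketch illustrates the operation only; it is not the paper's DMS-SK implementation, and the function name and valid-mode padding are our own assumptions:

```python
import numpy as np

def dilated_conv1d(x, w, dilation=1):
    """Valid-mode 1-D cross-correlation with a dilation factor (illustrative sketch).

    With dilation d, a kernel of length k spans a receptive field of
    (k - 1) * d + 1 input samples while keeping only k weights.
    """
    k = len(w)
    span = (k - 1) * dilation + 1          # receptive field in input samples
    out_len = len(x) - span + 1            # valid (no-padding) output length
    return np.array([
        sum(x[i + j * dilation] * w[j] for j in range(k))
        for i in range(out_len)
    ])

# With dilation=2, a length-3 kernel sees 5 input samples per output:
y = dilated_conv1d(np.arange(8.0), np.ones(3), dilation=2)
print(y.tolist())  # → [6.0, 9.0, 12.0, 15.0]
```

In 2-D image networks such as DMS-SK the same idea applies per spatial axis, which is why dilated blocks can capture wider context at the same parameter cost as ordinary convolutions.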
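CTC simplifies speech-data preprocessing because the BLSTM can emit one label (or a blank) per frame without frame-level alignment; decoding then collapses each frame-wise path by merging repeats and dropping blanks. A minimal sketch of that collapse rule (standard CTC decoding, not the paper's specific decoder; the function name and blank index 0 are assumptions):

```python
def ctc_collapse(path, blank=0):
    """Collapse a frame-wise CTC label path: merge consecutive repeats,
    then remove blank symbols. Standard CTC many-to-one mapping."""
    out = []
    prev = None
    for label in path:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out

# Repeated frames merge; a blank between two identical labels keeps them distinct:
print(ctc_collapse([0, 1, 1, 0, 1, 2, 2, 0]))  # → [1, 1, 2]
```

During training, the CTC loss sums the probability of every path that collapses to the target sequence, which is what lets BLSTM-CTC learn directly from unsegmented audio and its transcript.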
