Dynamic gesture recognition based on CNN-LSTM-Attention

Jinwei Liu, Baoguo Wei, Yong Xu, Mingzhi Cai

doi:10.1109/icspcc52875.2021.9565034

Abstract

Compared with traditional human-computer interaction techniques, gesture recognition is closer to human expression habits and has the advantages of being efficient and easy to master. Vision-based gesture recognition does not require additional equipment, which makes it convenient and relatively low-cost. To recognize dynamic gestures against complex backgrounds, we build a backbone network based on SSD with dilated convolution, which greatly improves the quality of the detected feature maps, and then propose a CNN-LSTM-Attention based dynamic gesture recognition network. The spatial features of the dynamic gesture at each moment are first extracted from the gesture sequence; these features are then transformed into spatio-temporal features of the dynamic gesture by a recurrent neural network with an attention mechanism, and finally fed into a fully connected neural network for gesture recognition. The dynamic gesture recognition network achieves a 93.5% recognition rate on the Sahand dataset, which demonstrates its effectiveness.
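
As an illustration of the attention step described above, the following is a minimal sketch of how per-frame features from a recurrent network might be pooled into a single sequence-level vector before classification. The abstract does not specify the form of the attention mechanism, so the dot-product scoring, the `query` vector, and the function names `softmax` and `attention_pool` are assumptions made for illustration only and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Pool T per-frame vectors into one context vector.

    hidden_states: list of T vectors (e.g. recurrent outputs, one per frame).
    query: a vector of the same dimension; in a trained model it would be
           learned. Dot-product scoring is an assumption, not the paper's
           stated method.
    """
    # One relevance score per time step.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    # Normalize scores into weights that sum to 1.
    weights = softmax(scores)
    # Weighted sum of the per-frame vectors.
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return context, weights


# Toy sequence of three frames with two-dimensional features.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
context, weights = attention_pool(hidden_states, query)
```

In this toy input the third frame aligns best with the query and therefore receives the largest weight; the resulting `context` vector is what would be passed on to a fully connected classifier.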

Full Text