Abstract

Compared with traditional human-computer interaction techniques, gesture recognition is closer to human expression habits and has the advantages of being efficient and easy to master. Vision-based gesture recognition requires no additional equipment, which makes it convenient and relatively low-cost. To recognize dynamic gestures against complex backgrounds, we build a backbone network based on SSD with dilated convolution, which greatly improves the quality of the detected feature maps, and we then propose a CNN-LSTM-Attention based dynamic gesture recognition network. The spatial features of the dynamic gesture at each moment are first extracted from the gesture sequence; these features are then transformed into spatio-temporal features by a recurrent neural network with an attention mechanism, and finally fed into a fully connected neural network for gesture recognition. The dynamic gesture recognition network achieves a 93.5% recognition rate on the Sahand dataset, which demonstrates its effectiveness.
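The attention step described above, in which per-frame features from the recurrent network are combined into a single spatio-temporal descriptor, can be illustrated with a minimal sketch. The abstract does not specify the form of the attention mechanism, so the dot-product scoring below, the `query` vector, and the function names are illustrative assumptions rather than the authors' design. Pure Python is used so that the example runs without any deep learning framework.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Collapse T per-frame vectors into one vector with attention weights.

    hidden_states: list of T vectors (lists of floats), standing in for the
        recurrent network's output at each time step.
    query: hypothetical learned scoring vector of the same dimension
        (an assumption; the abstract does not describe the scoring function).
    """
    # One scalar relevance score per frame (dot product with the query).
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    # Normalize the scores into weights that sum to 1 across time.
    weights = softmax(scores)
    # Weighted sum of the frame vectors gives the sequence-level feature.
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return context, weights


# Toy sequence of three 2-dimensional frame features.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
context, weights = attention_pool(frames, query)
```

In this toy input the third frame scores highest against the query, so it receives the largest weight. In a full model the resulting `context` vector would be passed to the fully connected classifier.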
