Abstract

Deep networks that rely on upsampling tend to lose detail texture and other high-frequency features during feature extraction, which seriously degrades the accuracy of gesture recognition in complex scenes. This study integrates object detection and gesture recognition into one model and proposes a gesture detection and recognition method for human-computer interaction based on a pyramid frequency feature fusion module and multiscale attention. The pyramid fusion module performs efficient feature fusion to obtain feature layers rich in both detail and semantic information, which helps improve the efficiency and accuracy of gesture recognition. In addition, a multiscale attention module adaptively mines important and effective feature information from both temporal and spatial channels and is embedded into the detection layer. As a result, the proposed network enhances the effective information and suppresses the invalid information in the detection layer. Experimental results show that the proposed model makes full use of the high-low frequency feature fusion module without replacing the basic backbone network, which greatly reduces the computational overhead while improving the detection accuracy.
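
The fusion idea in the abstract can be illustrated with a minimal sketch of generic top-down pyramid fusion. This is not the paper's module: it assumes every level already has the same channel count and that each level is exactly half the size of the previous one, and it uses nearest-neighbour upsampling with plain addition. The names upsample2x and fuse_top_down are hypothetical.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fuse_top_down(features):
    """Fuse (C, H, W) maps ordered from shallow (large) to deep (small).

    Assumption: equal channel counts and a 2x size step between levels.
    Each deeper map is upsampled and added to the next shallower map, so
    every output level mixes fine detail with deeper semantic context.
    """
    fused = [features[-1]]  # the deepest level is passed through unchanged
    for shallow in reversed(features[:-1]):
        fused.append(shallow + upsample2x(fused[-1]))
    return fused[::-1]  # restore shallow-to-deep order

# Hypothetical three-level pyramid with 4 channels per level.
rng = np.random.default_rng(0)
pyramid = [rng.standard_normal((4, s, s)) for s in (32, 16, 8)]
fused = fuse_top_down(pyramid)
```

Each fused level keeps the spatial size of its input level, so a detection head attached to a level does not need to change.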

Highlights

  • With the rapid development of science and technology, interaction between humans and machines now appears in more scenarios [1, 2]. The user's requirements for the friendliness, usability, and efficiency of human-computer interaction methods have risen further [3]

  • To improve the accuracy of gesture recognition as much as possible with less computational overhead, and to avoid the limitations of existing gesture recognition methods, a feature fusion module based on the feature pyramid network is proposed. It performs efficient feature fusion to obtain feature layers rich in both detail and semantic information

  • The proposed network further enhances the effective information and suppresses the invalid information in the detection layer. Therefore, this study makes full use of the feature information between the detection layers without replacing the basic backbone network, which greatly reduces the computational overhead while improving the detection accuracy


Summary

Introduction

With the rapid development of science and technology, interaction between humans and machines now appears in more scenarios [1, 2]. The user's requirements for the friendliness, usability, and efficiency of human-computer interaction methods have risen further [3]. The Feature Fusion Single Shot Multibox Detector fuses multiple shallow feature layers of different scales into one large-scale feature layer and constructs a new feature pyramid for detection by downsampling that layer. Although these deep methods have effectively improved on the accuracy of traditional gesture recognition algorithms, their complex feature fusion methods have greatly reduced the detection speed [19]. To improve the accuracy of gesture recognition as much as possible with less computational overhead, and to avoid the limitations of existing gesture recognition methods, a feature fusion module based on the feature pyramid network is proposed. It performs efficient feature fusion to obtain feature layers rich in both detail and semantic information. The proposed network further enhances the effective information and suppresses the invalid information in the detection layer. Therefore, this study makes full use of the feature information between the detection layers without replacing the basic backbone network, which greatly reduces the computational overhead while improving the detection accuracy.
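
The "enhancement of effective information and suppression of invalid information" described above can be pictured as a gating step on a feature map. The sketch below is a simplified, generic channel-gating block in the squeeze-and-excitation style, swapped in for illustration. It is not the paper's multiscale attention module and it ignores the temporal and spatial aspects. The function channel_attention and the weight matrices w1 and w2 are hypothetical, and the weights are random rather than trained.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """Reweight the channels of x, shape (C, H, W), with gates in (0, 1).

    Assumption: w1 has shape (C // r, C) and w2 has shape (C, C // r)
    for some reduction ratio r. Gates near 1 keep a channel, gates
    near 0 suppress it.
    """
    squeezed = x.mean(axis=(1, 2))           # global average pool, shape (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)  # bottleneck with ReLU
    gates = sigmoid(w2 @ hidden)             # one gate per channel, shape (C,)
    return x * gates[:, None, None], gates

# Hypothetical detection-layer feature map: 8 channels, 16x16, ratio r = 4.
rng = np.random.default_rng(0)
feature = rng.standard_normal((8, 16, 16))
w1 = rng.standard_normal((2, 8))
w2 = rng.standard_normal((8, 2))
weighted, gates = channel_attention(feature, w1, w2)
```

Because the gate is a per-channel scalar, the block adds only two small matrix products per feature map, which is in keeping with the low-overhead goal stated above.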

Feature Pyramid Network
Experimental Results and Analysis
Experiment Configuration and Parameter Setting
Conclusion
