Hand gesture recognition is an important problem in human–computer interaction, and static hand gestures are one form of it. This study develops a static hand gesture recognition system consisting of three modules: a feature extraction module, a processing module, and a classification module. The feature extraction module uses top-down human pose estimation to extract not only keypoints but also body and hand bounding boxes. The processing module normalizes and processes these features, and its output serves as the input to the classification module, for which we propose an architecture called the Two-pipeline architecture. In the classification module, we also evaluate different methods to find the one most suitable for this task. Experiments were conducted on three datasets: HANDS, OUHANDS, and SHAPE. The proposed Two-pipeline architecture, with 2.5 million parameters, obtained accuracies of 94%, 98%, and 94% on the three datasets, respectively. A lightweight version with 0.22 million parameters achieved accuracies of 91%, 94%, and 96% on the same datasets.
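As a rough illustration of what the processing step might look like, the sketch below rescales keypoints into coordinates relative to a bounding box and flattens them into a feature vector for a classifier. The abstract does not state the normalization scheme, so the box layout `(x_min, y_min, x_max, y_max)` and the function names `normalize_keypoints` and `to_feature_vector` are assumptions made for this example, not the authors' method.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def normalize_keypoints(keypoints: List[Point], box: Box) -> List[Point]:
    """Map pixel keypoints to coordinates relative to a bounding box.

    Points inside the box land in [0, 1] on both axes, which removes the
    dependence on where the hand sits in the image and how large it is.
    """
    x_min, y_min, x_max, y_max = box
    width = x_max - x_min
    height = y_max - y_min
    if width <= 0 or height <= 0:
        raise ValueError("bounding box must have positive width and height")
    return [((x - x_min) / width, (y - y_min) / height) for x, y in keypoints]


def to_feature_vector(keypoints: List[Point], box: Box) -> List[float]:
    """Flatten normalized keypoints into [x0, y0, x1, y1, ...]."""
    flat: List[float] = []
    for x, y in normalize_keypoints(keypoints, box):
        flat.extend([x, y])
    return flat
```

A vector produced this way could be passed to any classifier; normalizing against the hand box rather than the full image is one plausible choice, and normalizing against the body box would work the same way.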