Human-computer interaction based on hand gestures is both intuitive and versatile, with diverse applications in smart homes, games, operating theaters, and vehicle infotainment systems. This research presents a novel architecture that combines a convolutional neural network (CNN) with traditional feature extractors and examines its accuracy on static hand gesture recognition. The research makes three contributions. First, we use the Non-Dominated Sorting Genetic Algorithm II (NSGA-II), an evolutionary algorithm, to classify and select image features across five methods: the Gabor filter, the Hu moment, the Zernike moment, the complex moment, and the Fourier moment. Experimental results demonstrated that the combination of the Gabor filter, the Hu moment, and the Zernike moment achieved the best result, with an accuracy of 98.3% to 99.0%. The combination of the Zernike moment and the Hu moment achieved an accuracy of 95.5% to 98.0%. Second, we propose the Multiple Feature Convolutional Neural Network (MFCNN) model to improve image recognition through a combination of validation techniques and feature descriptors. Extensive experiments were conducted on binary and grayscale images with two validation techniques: holdout and leave-one-subject-out cross-validation. The proposed architecture was evaluated on two datasets and compared with state-of-the-art CNNs. The Massey dataset contains 2,524 images of 36 gestures, and the OUHANDS dataset contains 3,000 images of 10 gestures. Experimental results demonstrated a high recognition rate using descriptors with low computational cost and reduced size. Third, we generate sentences using the Beam Search (BS) algorithm.
Data obtained from CNN/Daily Mail documents, together with the image recognition results (i.e., the image labels), were used to test four different question sizes: 100, 1,000, 10,000, and 40,000. The experimental results showed that our method achieves high-quality sentence generation.
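To illustrate the general idea behind beam search, the following is a minimal sketch and not the implementation used in this work. The function names, the toy bigram table, and its probabilities are all hypothetical; a real system would replace `toy_next_log_probs` with a trained language model. At each step the search keeps only the `beam_width` highest-scoring partial sentences instead of a single greedy choice.

```python
import math

# Hypothetical toy bigram model: P(next token | previous token).
# These values are invented for illustration only.
TOY_BIGRAMS = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"hand": 0.5, "gesture": 0.5},
    "a": {"gesture": 0.9, "hand": 0.1},
    "hand": {"</s>": 1.0},
    "gesture": {"</s>": 1.0},
}


def toy_next_log_probs(seq):
    """Return log-probabilities of each possible next token given the last token."""
    return {tok: math.log(p) for tok, p in TOY_BIGRAMS.get(seq[-1], {}).items()}


def beam_search(next_log_probs, start, end, beam_width=2, max_len=10):
    """Return (sequence, log-score) pairs sorted from best to worst."""
    beams = [([start], 0.0)]  # live hypotheses
    finished = []             # hypotheses that already emitted the end token
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = [
            (seq + [tok], score + logp)
            for seq, score in beams
            for tok, logp in next_log_probs(seq).items()
        ]
        if not candidates:
            break
        # Keep only the beam_width best expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            (finished if seq[-1] == end else beams).append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # include hypotheses cut off at max_len
    return sorted(finished, key=lambda c: c[1], reverse=True)


best_seq, best_score = beam_search(toy_next_log_probs, "<s>", "</s>", beam_width=2)[0]
greedy_seq, greedy_score = beam_search(toy_next_log_probs, "<s>", "</s>", beam_width=1)[0]
print(" ".join(best_seq), round(math.exp(best_score), 2))
print(" ".join(greedy_seq), round(math.exp(greedy_score), 2))
```

In this toy table, a beam width of 1 (greedy decoding) commits to the locally most likely first word and ends with a lower-probability sentence than a beam width of 2, which is the usual motivation for keeping several hypotheses alive.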