Hand Posture Recognition (HPR) plays a crucial role in enabling effective human-computer interaction, particularly for individuals with hearing disabilities. This study compares five models, namely MobileNetV2 96x96 0.35, MobileNetV1 96x96 0.25, MobileNetV1 96x96 0.1, self-designed Network 1, and self-designed Network 2, on the Sébastien Marcel Static Hand Posture Database. Four evaluation metrics (inferencing time, peak RAM usage, flash usage, and accuracy) are used to analyze performance. The experiment workflow for each model comprises five major steps. First, 120 images are randomly selected from the Sébastien Marcel Static Hand Posture Database and converted to JPG format. The images are then divided into 80% training data and 20% testing data. Next, the images are normalized and features are extracted for further processing. Each model is then trained individually on the preprocessed data, with its parameters optimized during training. Finally, the trained models are evaluated on the testing set to assess their hand posture recognition performance. The results indicate that MobileNetV2 96x96 0.35 achieves the highest accuracy of 96.69% while consuming fewer hardware resources than the other models. MobileNetV1 96x96 0.1 demonstrates the lowest inferencing time and peak RAM usage, making it suitable for real-time applications. Furthermore, self-designed Network 1 exhibits the lowest flash usage, making it a viable option for resource-constrained devices. This study provides valuable insights into the selection of CNN architectures for HPR and offers guidance for practitioners choosing models based on specific application requirements.
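As a rough illustration of the workflow summarized above, the following minimal sketch loads 96x96 images, applies an 80/20 train/test split, normalizes pixel values, and fine-tunes a MobileNetV2 with width multiplier 0.35. The abstract does not specify a framework, directory layout, or hyperparameters, so the tf.keras setup, folder name, seed, batch size, and epoch count below are all assumptions for illustration only.

# Hypothetical sketch of the described pipeline; the framework (tf.keras),
# data directory, and hyperparameters are assumptions, not details from the study.
import tensorflow as tf

IMG_SIZE = (96, 96)
DATA_DIR = "hand_posture_jpgs"  # hypothetical directory of converted JPG images

# 80% training / 20% testing split
train_ds = tf.keras.utils.image_dataset_from_directory(
    DATA_DIR, validation_split=0.2, subset="training", seed=42,
    image_size=IMG_SIZE, batch_size=16)
test_ds = tf.keras.utils.image_dataset_from_directory(
    DATA_DIR, validation_split=0.2, subset="validation", seed=42,
    image_size=IMG_SIZE, batch_size=16)
num_classes = len(train_ds.class_names)

# Normalize pixel values to [0, 1] before training
normalize = tf.keras.layers.Rescaling(1.0 / 255)
train_ds = train_ds.map(lambda x, y: (normalize(x), y))
test_ds = test_ds.map(lambda x, y: (normalize(x), y))

# "MobileNetV2 96x96 0.35": 96x96 input, width multiplier (alpha) of 0.35
base = tf.keras.applications.MobileNetV2(
    input_shape=(96, 96, 3), alpha=0.35, include_top=False,
    weights="imagenet", pooling="avg")
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Train on the preprocessed data, then evaluate on the held-out 20%
model.fit(train_ds, epochs=20)
model.evaluate(test_ds)

The other candidate models (MobileNetV1 at width multipliers 0.25 and 0.1, and the two self-designed networks) would slot into the same pipeline by swapping the base model; the on-device metrics reported in the study (inferencing time, peak RAM, flash usage) would be measured after deployment rather than in this training script.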