SDViT: Stacking of Distilled Vision Transformers for Hand Gesture Recognition

Chun Keat Tan, Chin Poo Lee, Kian Ming Lim, Roy Kwang Yang Chang, Ali Alqahtani

doi:10.3390/app132212204

Open Access

https://doi.org/10.3390/app132212204

Journal: Applied Sciences
Publication Date: Nov 10, 2023
Citations: 2
License type: CC BY 4.0

Affiliation: Multimedia University, King Khalid University

Abstract

Hand gesture recognition (HGR) is a rapidly evolving field with the potential to revolutionize human–computer interactions by enabling machines to interpret and understand human gestures for intuitive communication and control. However, HGR faces challenges such as the high similarity of hand gestures, real-time performance, and model generalization. To address these challenges, this paper proposes the stacking of distilled vision transformers, referred to as SDViT, for hand gesture recognition. An initially pretrained vision transformer (ViT) featuring a self-attention mechanism is introduced to effectively capture intricate connections among image patches, thereby enhancing its capability to handle the challenge of high similarity between hand gestures. Subsequently, knowledge distillation is proposed to compress the ViT model and improve model generalization. Multiple distilled ViTs are then stacked to achieve higher predictive performance and reduce overfitting. The proposed SDViT model achieves a promising performance on three benchmark datasets for hand gesture recognition: the American Sign Language (ASL) dataset, the ASL with digits dataset, and the National University of Singapore (NUS) hand gesture dataset. The accuracies achieved on these datasets are 100.00%, 99.60%, and 100.00%, respectively.
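
The distillation and stacking steps described above can be illustrated with a minimal sketch. The abstract does not specify the loss or the stacking meta-learner, so everything here is an assumption: the loss follows the commonly used soft-target formulation (a hard-label cross-entropy term plus a temperature-scaled KL term against the teacher), and stacking is shown only as the construction of meta-features from the base models' class probabilities. The names `distillation_loss`, `stack_features`, `temperature`, and `alpha` are hypothetical and do not come from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Assumed soft-target loss (not confirmed by the paper):
    alpha * CE(student, hard label)
    + (1 - alpha) * T^2 * KL(teacher_T || student_T)."""
    hard = -math.log(softmax(student_logits)[label])
    t_soft = softmax(teacher_logits, temperature)
    s_soft = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(t_soft, s_soft) if t > 0)
    return alpha * hard + (1 - alpha) * temperature ** 2 * kl


def stack_features(per_model_logits):
    """Concatenate each base model's class probabilities into one
    meta-feature vector; a meta-learner (not shown) would be trained
    on these vectors."""
    features = []
    for logits in per_model_logits:
        features.extend(softmax(logits))
    return features


# Illustrative logits for a 3-class problem (made-up values).
teacher = [4.0, 1.0, 0.5]
student_a = [2.5, 1.2, 0.3]
student_b = [3.0, 0.2, 0.1]

loss = distillation_loss(student_a, teacher, label=0)
meta = stack_features([student_a, student_b])
```

In this sketch each distilled student contributes one probability vector, so two students on a 3-class problem yield a 6-dimensional meta-feature vector. Which meta-learner consumes those features is left open because the abstract does not say.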

Similar Papers

HGR-ViT: Hand Gesture Recognition with Vision Transformer
Chun Keat Tan ... Roy Kwang Yang Chang
Sensors | VOL. 23
14 Jun 2023

A Deep Learning-Based End-to-End Composite System for Hand Detection and Gesture Recognition
Adam Ahmed Qaid Mohammed ... Jiancheng Lv
Sensors | VOL. 19
30 Nov 2019

Hand Gesture Localization and Classification by Deep Neural Network for Online Text Entry
Shivraj Sharma ... M.K Bhuyan
07 Oct 2020

Hand gesture recognition via enhanced densely connected convolutional neural network
Yong Soon Tan ... Chin Poo Lee
Expert Systems with Applications | VOL. 175
04 Mar 2021
