Abstract

Sign language recognition (SLR) is an effective solution to the communication barriers that hearing- and speech-impaired individuals face when interacting with other communities. Its applications extend to human–robot interaction (HRI), virtual reality (VR), and augmented reality (AR). However, the diverse nature of sign languages, stemming from varying user habits and geographical regions, poses significant challenges. To address these challenges, we propose the Skeleton-based Multi-feature Learning (SML) method. This method comprises a Multi-Feature Aggregation (MFA) module, designed to capture the inherent relationships between different skeleton-based features and enable effective fusion of complementary information. Furthermore, we propose the Self-Knowledge-Distillation-Guided Adaptive Residual Decoupled Graph Convolutional Network (SGAR-DGCN) for feature extraction. SGAR-DGCN consists of three components: a Self-Knowledge Distillation (SKD) mechanism to improve model training, convergence, and accuracy; a DGCN-Block, incorporating a decoupled GCN and Spatio-Temporal-Channel attention (STC) for efficient feature extraction; and an Adaptive Residual Block (ARes-Block) for cross-layer information fusion. Experimental results demonstrate that our SML method outperforms state-of-the-art approaches on the WLASL (55.85%) and AUTSL (96.85%) datasets while using skeleton data alone. Code is available at https://github.com/DzwFine37/SML.
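To make two of the abstract's core mechanisms more concrete, below is a minimal, self-contained PyTorch sketch of (a) a decoupled graph convolution of the kind the DGCN-Block builds on, in which channels are split into groups and each group aggregates joints over its own learnable adjacency, and (b) a generic self-knowledge-distillation loss in the spirit of SKD, where an auxiliary branch is trained toward the network's own final predictions. All names, shapes, and hyperparameters here (e.g., `num_joints`, `groups`, temperature `T`) are illustrative assumptions rather than the authors' released implementation; see the linked repository for the actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DecoupledGCN(nn.Module):
    """Sketch of a decoupled graph convolution: instead of one adjacency
    shared by all channels, channels are split into groups and each group
    aggregates over its own learnable topology. Shapes are illustrative."""

    def __init__(self, in_channels: int, out_channels: int,
                 num_joints: int = 27, groups: int = 8):
        super().__init__()
        assert out_channels % groups == 0
        self.groups = groups
        # One learnable adjacency per channel group, initialized to identity.
        self.A = nn.Parameter(torch.eye(num_joints).repeat(groups, 1, 1))
        # Pointwise projection to the output channel width.
        self.proj = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, T, V) = (batch, channels, frames, joints)
        x = self.proj(x)
        n, c, t, v = x.shape
        x = x.view(n, self.groups, c // self.groups, t, v)
        # Each channel group aggregates joints over its own adjacency A[g].
        x = torch.einsum('ngctv,gvw->ngctw', x, self.A)
        return x.reshape(n, c, t, v)


def skd_loss(aux_logits: torch.Tensor, final_logits: torch.Tensor,
             T: float = 3.0) -> torch.Tensor:
    """Self-knowledge-distillation term (sketch): soften the network's own
    final predictions with temperature T and use them as the target for an
    auxiliary branch. The paper's exact formulation may differ."""
    p_teacher = F.softmax(final_logits.detach() / T, dim=1)
    log_p_student = F.log_softmax(aux_logits / T, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction='batchmean') * T * T


# Usage sketch: (batch=2, 64 channels, 16 frames, 27 joints) -> 128 channels.
layer = DecoupledGCN(64, 128, num_joints=27, groups=8)
out = layer(torch.randn(2, 64, 16, 27))  # -> (2, 128, 16, 27)
```

The design intuition behind the grouping is that different channel groups can specialize in different skeletal topologies (e.g., hand-centric vs. whole-body relations), which is one way to fuse complementary skeleton-based features within a single layer.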
