Abstract
Most American Sign Language (ASL) words share similar characteristics, usually along the sign trajectory, which raises similarity issues and hinders ubiquitous application. Recognition of similar ASL words confuses translation algorithms and leads to misclassification. In this paper, a recognition algorithm for a large database of dynamic sign words, called fast fisher vector bi-directional long short-term memory (FFV-Bi-LSTM), is designed based on the fast fisher vector (FFV) and bi-directional long short-term memory (Bi-LSTM) methods. The algorithm is trained on 3D hand-skeletal motion and orientation-angle features captured by the leap motion controller (LMC). The features of each 3D video frame are concatenated and represented as a high-dimensional vector using FFV encoding. Evaluation results demonstrate that the FFV-Bi-LSTM algorithm accurately recognizes dynamic ASL words on the basis of prosodic and angle cues. Furthermore, comparison results show that FFV-Bi-LSTM achieves recognition accuracies of 98.6% and 91.002% for a randomly selected ASL dictionary and for 10 pairs of similar ASL words, respectively, under leave-one-subject-out cross-validation on the constructed dataset. The performance of FFV-Bi-LSTM is further evaluated on the ASL data set, the leap motion dynamic hand gestures (LMDHG) data set, and the semaphoric hand gestures contained in the Shape Retrieval Contest (SHREC) data set, improving accuracy on these three data sets by 2%, 2%, and 3.19%, respectively.
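The abstract describes concatenating per-frame skeletal features and encoding the sequence as one high-dimensional vector via FFV before feeding it to the Bi-LSTM. The paper's exact FFV formulation is not reproduced here; the sketch below illustrates the standard Fisher-vector mean-gradient encoding under a diagonal-covariance GMM, which FFV accelerates. The function name `fisher_vector` and the GMM parameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fisher_vector(frames, means, sigmas, weights):
    """Illustrative Fisher-vector encoding (mean gradients only).

    frames:  (T, D) per-frame feature vectors (e.g. concatenated
             LMC joint motion and orientation-angle features).
    means, sigmas: (K, D) diagonal-covariance GMM parameters.
    weights: (K,) GMM mixture weights.
    Returns a fixed-length (K*D,) descriptor for the whole sequence.
    """
    T, _ = frames.shape
    # Per-frame, per-component differences: (T, K, D)
    diff = frames[:, None, :] - means[None, :, :]
    # Log-likelihood of each frame under each Gaussian (up to a constant)
    log_p = -0.5 * np.sum((diff / sigmas) ** 2
                          + np.log(2 * np.pi * sigmas ** 2), axis=2)
    log_w = np.log(weights) + log_p
    log_w -= log_w.max(axis=1, keepdims=True)      # numerical stability
    gamma = np.exp(log_w)
    gamma /= gamma.sum(axis=1, keepdims=True)      # responsibilities (T, K)
    # Gradient of the log-likelihood w.r.t. the GMM means, normalized
    G = (gamma[:, :, None] * diff / sigmas ** 2).sum(axis=0)  # (K, D)
    G /= T * np.sqrt(weights)[:, None]
    fv = G.ravel()
    # Power- and L2-normalization, standard for Fisher-vector encodings
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    return fv / (np.linalg.norm(fv) + 1e-12)
```

The fixed-length output makes variable-length gesture sequences comparable, which is what allows a single downstream classifier (here, the Bi-LSTM) to operate on whole-sequence descriptors.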
Highlights
The incredible attention paid to human-computer interaction (HCI) makes human hands the most natural and efficient medium for expressing intentions in daily interaction activities [1]. This has led to the development of numerous HCI systems such as sign language recognition, robotics, and medical diagnostics, among others.
Motivated by [5], [6], we present 3D spatio-temporal skeletal hand-joint features, based on the prosodic model and orientation angle, to address misclassification of highly correlated American Sign Language (ASL) words.
We further evaluate our method on the semaphoric hand gestures contained in the Shape Retrieval Contest (SHREC) data set [64], the ASL data set [6], and the leap motion dynamic hand gestures (LMDHG) data set [63], respectively.
Summary
The incredible attention paid to human-computer interaction (HCI) makes human hands the most natural and efficient medium for expressing intentions in daily interaction activities [1]. This has led to the development of numerous HCI systems such as sign language recognition, robotics, and medical diagnostics, among others. American Sign Language (ASL) is one of the best-known sign languages, with an unwritten grammar characterized by hand motions and sometimes facial/body signs [3]. The language involves constructing very complex grammatical structures using dynamic word gestures. ASL comprises over ten thousand dynamic word gestures, with approximately 65% and 35% represented