Abstract

Arabic Sign Language recognition is an emerging field of research. Previous attempts at automatic vision-based recognition of Arabic Sign Language mainly focused on finger spelling and recognizing isolated gestures. In this paper we report the first continuous Arabic Sign Language recognition system, building on existing research in feature extraction and pattern recognition. The development of the presented work required collecting a continuous Arabic Sign Language database, which we designed and recorded in cooperation with a sign language expert. We intend to make the collected database available to the research community. Our system, based on spatio-temporal feature extraction and hidden Markov models, achieves an average word recognition rate of 94%, keeping in mind the use of a high-perplexity vocabulary and unrestrictive grammar. We compare our proposed work against existing sign language techniques based on accumulated image difference and motion estimation. The experimental results section shows that the proposed work outperforms existing solutions in terms of recognition accuracy.
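The abstract describes classification with one hidden Markov model per word, where recognition selects the model that best explains the observed feature sequence. As an illustration only, here is a minimal discrete-HMM forward-likelihood scorer in Python; the toy transition/emission matrices and word labels are assumptions for the sketch, not the paper's trained models:

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the scaled forward algorithm to avoid underflow.
    pi: initial state probabilities, A: transitions, B: emissions."""
    alpha = pi * B[:, obs[0]]
    log_lik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # predict, then weight by emission
        s = alpha.sum()
        log_lik += np.log(s)
        alpha /= s                      # rescale to keep values in range
    return log_lik

# Toy left-to-right HMM shared by both "words"; only emissions differ.
pi = np.array([1.0, 0.0])
A = np.array([[0.7, 0.3],
              [0.0, 1.0]])
B1 = np.array([[0.9, 0.1],   # word1: state 0 emits symbol 0, state 1 emits 1
               [0.1, 0.9]])
B2 = np.array([[0.1, 0.9],   # word2: the reverse emission pattern
               [0.9, 0.1]])

obs = [0, 0, 1, 1]
scores = {"word1": forward_log_likelihood(obs, pi, A, B1),
          "word2": forward_log_likelihood(obs, pi, A, B2)}
best = max(scores, key=scores.get)
print(best)  # word1
```

In a continuous recognizer the per-word models are chained under a language model rather than scored in isolation, but the per-model likelihood computation is the same building block.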

Highlights

  • The growing popularity of vision-based systems has led to a revolution in gesture recognition technology

  • Our system, based on spatio-temporal feature extraction and hidden Markov models, achieves an average word recognition rate of 94%, keeping in mind the use of a high-perplexity vocabulary and unrestrictive grammar

  • Vision-based systems, on the other hand, provide a more natural environment within which to capture the gesture data. The flipside of this method is that working with images requires intelligent feature extraction techniques in addition to image processing techniques like segmentation which may add to the computational complexity of the system


Summary

Introduction

The growing popularity of vision-based systems has led to a revolution in gesture recognition technology. Vision-based gesture recognition systems are primed for applications such as virtual reality, multimedia gaming and hands-free interaction with computers. Another popular application is sign language recognition, which is the focus of this paper. Vision-based systems, in contrast to sensor-glove approaches, provide a more natural environment within which to capture the gesture data. The flipside of this method is that working with images requires intelligent feature extraction techniques in addition to image processing techniques such as segmentation, which may add to the computational complexity of the system. Working with isolated ArSL gestures, prior systems eliminate the temporal dependency of the data by accumulating successive prediction errors into one image that represents the motion information. This removal of temporal dependency allows for simple classification methods with lower computational and storage requirements.
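The accumulated-differences idea described above (collapsing a gesture clip into a single motion image) can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation; the threshold value and normalization are assumptions:

```python
import numpy as np

def accumulated_difference(frames, threshold=15):
    """Collapse a clip of grayscale frames into one motion image by
    accumulating thresholded absolute differences between successive
    frames, removing the temporal dependency of the data."""
    acc = np.zeros(frames[0].shape, dtype=np.float64)
    for prev, curr in zip(frames[:-1], frames[1:]):
        diff = np.abs(curr.astype(np.float64) - prev.astype(np.float64))
        acc += (diff > threshold) * diff   # ignore sub-threshold noise
    if acc.max() > 0:
        acc *= 255.0 / acc.max()           # normalize for feature extraction
    return acc.astype(np.uint8)

# Toy clip: a bright square sliding one pixel per frame leaves a motion trail.
frames = []
for t in range(5):
    f = np.zeros((32, 32), dtype=np.uint8)
    f[10:16, 5 + t:11 + t] = 200
    frames.append(f)

motion = accumulated_difference(frames)
print(motion.shape)      # (32, 32)
print(motion.max() > 0)  # True: motion trail is present
```

Because the whole gesture is summarized in one image, a static classifier can be applied directly; the trade-off, as noted above, is that this suits isolated gestures rather than continuous signing, where temporal modeling (e.g. HMMs) is needed.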

The Dataset
Feature Extraction
Proposed Feature Extraction
Adapted Feature Extraction Solutions
Accumulated Differences Solution
Classification
Experimental Results
Number of Hidden States
Length of the Feature Vector
Number of Gaussian Mixtures
Choice of Threshold
Conclusions and Future Work

