ABSTRACT Sign language is a medium of communication for people with hearing disabilities. In video-based sign language recognition, static and dynamic gestures are identified and translated into humanly understandable phrases to achieve the communication objective. However, videos contain redundant frames that require additional processing, and the number of key-frames can be reduced; selecting particular key-frames without losing the required information is a challenging task. A key-frame extraction algorithm is therefore used to speed up the sign language recognition process by extracting only the essential key-frames. The proposed framework reduces computation overhead by selecting distinct key-frames for the recognition process. Discrete Wavelet Transform (DWT), Discrete Cosine Transform (DCT), and Histogram of Oriented Gradients (HOG) are used to extract unique features. For classification, we use bagged-tree and boosted-tree ensemble methods, Fine KNN, and SVM. We tested the methodology on video-based datasets of Pakistani Sign Language, achieving an overall accuracy of 97.5% on 37 Urdu alphabets and 95.6% on 100 common words.
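The abstract describes the pipeline only at a high level. As a rough illustration, the Python sketch below chains a simple frame-difference heuristic (a stand-in for the paper's key-frame extraction algorithm, which is not detailed here), DWT/DCT/HOG feature extraction, and an SVM classifier. The frame size, the difference threshold, and the helper names `select_key_frames` and `extract_features` are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only: implementation details are not given in the abstract,
# so frame size, key-frame heuristic, and classifier settings are assumptions.
import numpy as np
import pywt                                   # Discrete Wavelet Transform
from scipy.fft import dctn                    # Discrete Cosine Transform
from skimage.feature import hog               # Histogram of Oriented Gradients
from skimage.transform import resize
from sklearn.svm import SVC

def select_key_frames(frames, threshold=0.1):
    """Frame-difference heuristic standing in for the key-frame extraction
    step (hypothetical threshold)."""
    keys = [frames[0]]
    for frame in frames[1:]:
        if np.mean(np.abs(frame - keys[-1])) > threshold:
            keys.append(frame)
    return keys

def extract_features(frame, size=(64, 64)):
    """Concatenate DWT, DCT, and HOG descriptors for one grayscale frame."""
    img = resize(frame, size, anti_aliasing=True)
    cA, _ = pywt.dwt2(img, 'haar')                 # keep the approximation sub-band
    dct_block = dctn(img, norm='ortho')[:8, :8]    # low-frequency DCT coefficients
    hog_desc = hog(img, orientations=9,
                   pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    return np.concatenate([cA.ravel(), dct_block.ravel(), hog_desc])

# Toy example with synthetic data; real input would be grayscale video frames
# of Pakistani Sign Language gestures with their class labels.
rng = np.random.default_rng(0)
videos = [rng.random((30, 64, 64)) for _ in range(10)]
labels = rng.integers(0, 2, size=10)

X = np.array([
    np.mean([extract_features(f) for f in select_key_frames(list(v))], axis=0)
    for v in videos
])
clf = SVC(kernel='rbf').fit(X, labels)        # SVM is one of the reported classifiers
print(clf.score(X, labels))
```

Any of the other reported classifiers (bagged trees, boosted trees, Fine KNN) could be swapped in for the `SVC` stage in the same way.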