Abstract

Extracting and recognizing complex human movements such as sign language gestures from video sequences is a challenging task. In this paper, this difficult problem is approached using Indian sign language (ISL) videos. A new segmentation algorithm is developed by fusing features from the discrete wavelet transform (DWT) and the local binary pattern (LBP). A 2D point cloud is formed from the fused features, representing the local hand shapes in consecutive video frames. We validate the proposed feature extraction model against state-of-the-art features such as HOG, SIFT and SURF for each sign video on the same ANN classifier, and find that the Haar-LBP fused features represent sign video data better than HOG, SIFT and SURF, owing to the combination of global and local features in the proposed feature matrix. The extracted features are fed to an artificial neural network (ANN) classifier with labels forming the corresponding words. The proposed ANN classifier is tested against state-of-the-art classifiers such as AdaBoost, support vector machine (SVM) and other ANN methods on different features extracted from the ISL dataset, and the classifiers are evaluated for accuracy and correctness in identifying the signs. The ANN classifier achieved a recognition rate of 92.79% with the maximum number of training instances, which is far higher than that of existing sign language recognition works using other features and ANN classifiers on our ISL dataset.
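
To make the described pipeline concrete, the following is a minimal sketch of Haar DWT and LBP feature fusion for a single grayscale frame. It assumes PyWavelets for the Haar transform, scikit-image for LBP, and scikit-learn for the ANN; the LBP radius, histogram binning, and classifier architecture shown are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of Haar-DWT + LBP feature fusion for one video frame.
# Library choices (PyWavelets, scikit-image, scikit-learn) and all
# parameters below are illustrative assumptions, not the paper's exact setup.
import numpy as np
import pywt
from skimage.feature import local_binary_pattern

def fused_frame_features(gray_frame: np.ndarray) -> np.ndarray:
    """Fuse global Haar-wavelet coefficients with local LBP statistics."""
    # Global shape information: approximation band of a single-level 2D Haar DWT.
    cA, (cH, cV, cD) = pywt.dwt2(gray_frame.astype(float), "haar")
    dwt_feat = cA.flatten()

    # Local texture information: uniform LBP histogram (radius/points assumed).
    lbp = local_binary_pattern(gray_frame, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=np.arange(0, 11), density=True)

    # Fusion by concatenation into a single per-frame feature vector.
    return np.concatenate([dwt_feat, lbp_hist])

# Stacking per-frame vectors over a sign video yields a 2D feature matrix
# (frames x features) that, with word labels, can train an ANN classifier,
# e.g. a multilayer perceptron:
# from sklearn.neural_network import MLPClassifier
# clf = MLPClassifier(hidden_layer_sizes=(100,)).fit(X_train, y_train)
```
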

Highlights

  • Automatic sign language recognition is a complicated problem for computer vision scientists, which involves mining and categorizing spatial patterns of human poses in videos

  • Sign language created from human action is defined as a temporal variation of the human body in a video sequence, characterized by movement of the hands with respect to the body, face, and head, including hand shapes

  • Automatic sign extraction from a sign video sequence is complicated by complex hand poses and body actions performed at different speeds depending on the signer

Introduction

Automatic sign language recognition is a complicated problem for computer vision scientists, which involves mining and categorizing spatial patterns of human poses in videos. Sign language created from human action is defined as a temporal variation of the human body in a video sequence, characterized by movement of the hands with respect to the body, face, and head, including hand shapes. The problem is to extract and identify a human pose, and to classify it into labels based on trained human signature action models [1]. The objective of this work is to extract the signatures of Indian sign language poses from videos, given a specific sign as input. Automatic sign extraction from a sign video sequence is complicated by complex hand poses and body actions performed at different speeds depending on the signer.
