Abstract

With the recent growth of Smart TV technology, the demand for unique and beneficial applications motivates the study of a gesture-based interaction system for a Smart TV-like environment. A single system combining movie recommendation, a social media platform, a call-a-friend application, weather updates, a chat application, and a tourism platform, controlled by a natural gesture interface, is proposed to allow ease of use and natural interaction. The gesture recognition task was designed around 24 gestures, 13 static and 11 dynamic, suited to this environment. A dataset of RGB and depth image sequences was collected, preprocessed, and used to train the proposed deep learning architecture. A three-dimensional Convolutional Neural Network (3DCNN) followed by a Long Short-Term Memory (LSTM) model extracts the spatio-temporal features. After classification, a Finite State Machine (FSM) filters the model's class decisions according to the application context. The results indicate that combining depth and RGB data yields an accuracy of 97.8% on eight selected gestures, while the FSM improves the real-time recognition rate from 89% to 91%.
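The context-aware FSM described above accepts the classifier's output only when that gesture is meaningful in the current application state. Below is a minimal sketch of how such a gate might work; the state names, allowed-gesture sets, transitions, and confidence threshold are illustrative assumptions, not the configuration used in the study.

    # Minimal sketch of a context-aware FSM that gates classifier output:
    # a predicted gesture is accepted only if it is valid in the current
    # application state. States and gesture names here are hypothetical.
    ALLOWED = {
        "home":    {"swipe_left", "swipe_right", "select", "thumbs_up"},
        "movies":  {"swipe_left", "swipe_right", "select", "back"},
        "weather": {"select", "back"},
    }

    TRANSITIONS = {  # (state, accepted_gesture) -> next state
        ("home", "select"): "movies",
        ("movies", "back"): "home",
        ("weather", "back"): "home",
    }

    def step(state, predicted_gesture, confidence, threshold=0.8):
        """Accept the prediction only if confident and valid for this state."""
        if confidence < threshold or predicted_gesture not in ALLOWED[state]:
            return state, None                        # reject: stay in the same state
        next_state = TRANSITIONS.get((state, predicted_gesture), state)
        return next_state, predicted_gesture          # accept and possibly change state

    state = "home"
    state, action = step(state, "select", 0.93)       # -> ("movies", "select")
    state, action = step(state, "thumbs_up", 0.95)    # rejected in "movies": (state, None)

Filtering in this way suppresses classes that cannot occur in the active application, which is how the FSM can lift real-time recognition from 89% to 91% as reported above.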

Highlights

  • Gestures are one of the most natural forms of physical body movement, involving the fingers, hands, head, face, or body to interact with the environment and convey meaningful information. Gesture recognition is the process by which a machine classifies or translates gestures produced by a human into meaningful commands

  • This study aims to introduce a hand gesture recognition system that works in a real-time application setting for a Smart TV-like environment

  • The proposed multimodal architecture consists of 3D convolutional neural network (3DCNN) layers, one stacked long short-term memory (LSTM) layer, and a fully connected layer followed by a softmax layer, as sketched below
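As a reading aid, here is a minimal sketch of such a 3DCNN + LSTM stack in PyTorch. The channel counts, frame length, resolution, and hidden size are illustrative assumptions rather than the exact configuration reported in the study; RGB and depth are simply stacked as a four-channel input.

    # Minimal sketch of a 3DCNN + LSTM gesture classifier.
    # Layer sizes, frame count, and resolution are illustrative assumptions.
    import torch
    import torch.nn as nn

    class Gesture3DCNNLSTM(nn.Module):
        def __init__(self, num_classes=24, in_channels=4):  # RGB + depth stacked
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv3d(in_channels, 32, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.MaxPool3d(kernel_size=(1, 2, 2)),   # pool spatially, keep time steps
                nn.Conv3d(32, 64, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.MaxPool3d(kernel_size=(1, 2, 2)),
            )
            self.lstm = nn.LSTM(input_size=64, hidden_size=128, batch_first=True)
            self.classifier = nn.Linear(128, num_classes)

        def forward(self, x):                          # x: (batch, channels, time, H, W)
            f = self.features(x)                       # (batch, 64, time, h, w)
            f = f.mean(dim=(3, 4))                     # global spatial average per frame
            f = f.transpose(1, 2)                      # (batch, time, 64) for the LSTM
            out, _ = self.lstm(f)
            return self.classifier(out[:, -1])         # last step -> class scores (softmax in loss)

    clip = torch.randn(2, 4, 30, 112, 112)             # 2 clips, 4 channels, 30 frames
    print(Gesture3DCNNLSTM()(clip).shape)              # torch.Size([2, 24])

The 3D convolutions capture short-range spatio-temporal patterns, spatial pooling keeps the time axis intact, and the LSTM aggregates the per-frame features before the fully connected layer produces class scores.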


Summary

Introduction

Gestures are one of the most natural forms of physical body movement, involving the fingers, hands, head, face, or body to interact with the environment and convey meaningful information. Deep learning models are considered to solve recognition and classification problems efficiently and accurately, yet their implementation in real-time application settings remains limited. This study aims to introduce a hand gesture recognition system that works in a real-time application setting for a Smart TV-like environment. The first application is used to test the model accuracy and consists of a simple interface showing the recognition result in real time. On the dataset of eight selected gestures with different settings, the system achieves accuracy higher than 90% in both offline and real-time testing.

Related Work
Proposed
Data Collection
Data Preprocessing
Multimodal Architecture
Context-Aware FSM Controller Model
Training and Validating
Experimental Result and Discussion
Comparison of Input Data
Comparison of Multimodal Input Data Result
Real-Time Experimental Result
Real-Time System
Findings
Future Work
