Abstract

In this paper, we propose a spatio-temporal human gesture recognition algorithm for degraded conditions using three-dimensional integral imaging and deep learning. The proposed algorithm combines the advantages of integral imaging and deep learning to provide an efficient human gesture recognition system in degraded environments such as occlusion and low-illumination conditions. The 3D data captured using integral imaging serves as the input to a convolutional neural network (CNN). The spatial features extracted by the convolutional and pooling layers of the network are fed into a bi-directional long short-term memory (BiLSTM) network, which is designed to capture the temporal variation in the input data. We compare the proposed approach with conventional 2D imaging and with previously reported approaches using spatio-temporal interest points with support vector machines (STIP-SVMs) and distortion-invariant non-linear correlation-based filters. Our experimental results suggest that the proposed approach is promising, especially in degraded environments. We observe a substantial improvement over previously published methods and find that 3D integral imaging provides superior performance over a conventional 2D imaging system. To the best of our knowledge, this is the first report that examines deep learning algorithms based on 3D integral imaging for human activity recognition in degraded environments.
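The CNN-to-BiLSTM pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's exact architecture: the layer sizes, frame resolution (3×64×64), sequence length, and number of gesture classes are all illustrative assumptions.

```python
# Sketch of a CNN + BiLSTM gesture classifier: convolution/pooling layers
# extract per-frame spatial features, and a bidirectional LSTM models the
# temporal variation across the frame sequence. Layer sizes are hypothetical.
import torch
import torch.nn as nn

class CNNBiLSTM(nn.Module):
    def __init__(self, n_classes=4, feat_dim=128, hidden=64):
        super().__init__()
        # Spatial feature extractor applied to each frame independently
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        # BiLSTM processes the sequence of per-frame feature vectors
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                              bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):  # x: (batch, time, channels, H, W)
        b, t = x.shape[:2]
        f = self.cnn(x.flatten(0, 1))   # (b*t, feat_dim)
        f = f.view(b, t, -1)            # (b, t, feat_dim)
        out, _ = self.bilstm(f)         # (b, t, 2*hidden)
        return self.fc(out[:, -1])      # classify from the final time step

model = CNNBiLSTM()
logits = model(torch.randn(2, 8, 3, 64, 64))  # 2 clips of 8 frames each
print(logits.shape)  # torch.Size([2, 4])
```

In the actual system, each input frame would be a 3D scene reconstruction computed from the integral-imaging elemental images rather than a raw 2D camera frame.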

Highlights

  • Human gesture recognition involves deriving meaningful inference from human motions and has a wide range of applications in human-computer interaction, patient monitoring, surveillance, robotics, sign language recognition, etc. [1]

  • Numerous approaches have been proposed for gesture recognition, including mathematical models such as Hidden Markov Models (HMMs) [2], spatio-temporal interest point (STIP) detectors [3], correlation filter-based approaches [4], etc.

  • The low-illumination effects considered in the experiments were simulated by applying computational models to the experimentally captured elemental images, in order to generate a large amount of low-light data for testing.
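A common computational model for this kind of low-light simulation scales the scene radiance down and adds photon (Poisson) shot noise plus Gaussian sensor read noise. The sketch below illustrates that idea only; the paper's exact degradation model and parameter values are not specified here, so the function name and parameters are illustrative.

```python
# Minimal low-light degradation sketch: attenuate brightness, add Poisson
# shot noise, then Gaussian read noise. All parameters are hypothetical.
import numpy as np

def simulate_low_light(img, brightness=0.05, photons=50.0,
                       read_sigma=2.0, rng=None):
    """img: float array in [0, 1]; returns a degraded image in [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    dark = img * brightness                       # attenuate illumination
    shot = rng.poisson(dark * photons) / photons  # photon shot noise
    read = rng.normal(0.0, read_sigma / 255.0, img.shape)  # sensor noise
    return np.clip(shot + read, 0.0, 1.0)

frame = np.full((64, 64), 0.8)                    # well-lit test frame
low = simulate_low_light(frame, rng=np.random.default_rng(0))
```

Applying such a model to each captured elemental image yields a large set of simulated low-light inputs without repeating the physical capture.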


Introduction

Human gesture recognition involves deriving meaningful inference from human motions and has a wide range of applications in human-computer interaction, patient monitoring, surveillance, robotics, sign language recognition, etc. [1]. Deep learning-based models for gesture recognition have gained wide acceptance due to their generalization capabilities and high accuracy in detecting and classifying gestures [5,6]. While these methods have been shown to work well on clean datasets, gesture recognition under degraded conditions remains a challenge, especially when gestures are partially occluded or captured in low-illumination conditions. In such cases, the features of the gestures may not be fully recorded during the camera pickup process, which makes recognition more challenging.
