Abstract

Human action recognition techniques have gained significant attention among next-generation technologies because of their ability to inspect video sequences and understand human actions. As a result, many fields have benefited from them. Deep learning has played a primary role in many approaches to human action recognition, and transfer learning is now spreading as a new learning paradigm. Accordingly, this study's main objective is to propose a framework for human action recognition with three main phases: pre-training, preprocessing, and recognition. The framework presents a set of novel techniques that are three-fold: (i) in the pre-training phase, a standard convolutional neural network is trained on a generic dataset to adjust its weights; (ii) to perform the recognition process, this pre-trained model is then applied to the target dataset; and (iii) the recognition phase exploits convolutional neural networks and long short-term memory to apply five different architectures. Three architectures are stand-alone and single-stream, while the other two combine the first three in a two-stream style. Experimental results show that the first three architectures recorded accuracies of 83.24%, 90.72%, and 90.85%, respectively, and the last two achieved accuracies of 93.48% and 94.87%, respectively. Moreover, the recorded results outperform other state-of-the-art models in the same field.
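
The abstract does not say how the two-stream architectures combine their single-stream components. As a purely illustrative sketch, one common option is late fusion, where each stream produces class scores and the class probabilities are averaged. The function names, the `weight_a` mixing weight, and the example scores below are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Convert raw class scores to probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_two_streams(scores_a, scores_b, weight_a=0.5):
    """Late fusion: weighted average of the two streams' class probabilities.

    weight_a is an assumed mixing weight, not a value reported in the paper.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both streams must score the same set of classes")
    probs_a = softmax(scores_a)
    probs_b = softmax(scores_b)
    return [weight_a * pa + (1.0 - weight_a) * pb
            for pa, pb in zip(probs_a, probs_b)]


def predict(scores_a, scores_b, weight_a=0.5):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_two_streams(scores_a, scores_b, weight_a)
    return max(range(len(fused)), key=fused.__getitem__)


# Made-up scores for three action classes from two hypothetical streams.
stream_a = [2.0, 1.0, 0.1]
stream_b = [0.5, 2.5, 0.3]
fused = fuse_two_streams(stream_a, stream_b)
predicted = predict(stream_a, stream_b)
```

Averaging probabilities rather than raw scores keeps the two streams on a comparable scale. Other fusion schemes, such as concatenating features before a shared classifier, are equally plausible readings of "two-stream style".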

Highlights

  • Understanding human actions by inspecting video sequences has become an essential research topic

  • Human Action Recognition (HAR) technology enables the computer to achieve this level of understanding

  • This paper proposes a Transfer Learning-based Human Action Recognition (TL-HAR) framework

Summary

Introduction

Understanding human actions by inspecting video sequences has become an essential research topic. The main idea was to use two CNNs to model spatial and temporal information. There are three main scenarios of CNN transfer learning: fixed feature extraction, fine-tuning and layer freezing, and pre-trained models [19]. In the last scenario, these pre-trained architectures are adopted to fine-tune each CNN with a different dataset. This paper proposes a Transfer Learning-based Human Action Recognition (TL-HAR) framework.
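
The summary names the transfer learning scenarios without defining them. The sketch below illustrates only the freezing mechanic that separates them, under one common reading: with every layer but the last frozen, the network acts as a fixed feature extractor, and with more layers left trainable it is fine-tuned. The `Layer` class, the layer names, and the toy gradients are hypothetical stand-ins for a real CNN and are not the paper's implementation.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """Toy stand-in for a network layer: a name, weights, and a trainable flag."""
    name: str
    weights: list
    trainable: bool = True


def freeze_all_but_last(layers, n_trainable=1):
    """Keep only the last n_trainable layers trainable and freeze the rest."""
    cutoff = len(layers) - n_trainable
    for i, layer in enumerate(layers):
        layer.trainable = i >= cutoff


def sgd_step(layers, grads, lr=0.1):
    """Apply one gradient descent step, skipping frozen layers."""
    for layer, grad in zip(layers, grads):
        if layer.trainable:
            layer.weights = [w - lr * g for w, g in zip(layer.weights, grad)]


# Pretend these weights came from pre-training on a generic dataset.
model = [
    Layer("conv1", [1.0, 2.0]),
    Layer("conv2", [3.0, 4.0]),
    Layer("classifier", [5.0, 6.0]),
]

# Fixed feature extraction: only the classifier is updated on the target data.
freeze_all_but_last(model, n_trainable=1)
sgd_step(model, grads=[[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]], lr=0.5)
```

Setting `n_trainable` to the number of layers corresponds to fine-tuning the whole network, while intermediate values freeze only the earliest layers.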
