Human activities recognition from video images by using convolutional neural network

Dan Wang, Yanmin Zhang, Jingfa Yao

doi:10.3233/jifs-236068

Abstract

Automatic human activity recognition from video images is needed for monitoring applications and for caring for disabled people. Processing the images captured by surveillance cameras makes it possible to build a smart, accurate system for recognizing human behavior. Because detecting humans in different scenes poses many challenges, several approaches have been implemented to recognize human activity from video. The complexity of human activities, background noise, and other factors all affect detection. To address these problems, this article describes two deep learning algorithms based on convolutional neural networks: a CNN + LSTM method and a 3D CNN method, both used to recognize human activities in video frames. Each algorithm is explained and analyzed in detail. The experiments are performed on two datasets, HMDB-51 and UCF101. On HMDB-51, the highest accuracy obtained was 70.2 for the CNN + LSTM method and 54.4 for the 3D CNN method. On UCF101, the highest accuracy obtained was 95.1 for the CNN + LSTM method and 90.8 for the 3D CNN method.
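The structural difference between the two methods named in the abstract can be illustrated with a toy NumPy sketch. It is not the authors' implementation: the abstract gives no layer sizes or training details, so the clip size, the single random filters, the hidden size, and every function name below are assumptions chosen only for illustration. In the CNN + LSTM route a 2D filter processes each frame independently and a recurrent layer then models the order of the frames; in the 3D CNN route a single filter spans several consecutive frames at once.

```python
import numpy as np

def conv2d_valid(frame, kernel):
    """Valid 2D cross-correlation of one (H, W) frame with a (kh, kw) kernel."""
    kh, kw = kernel.shape
    H, W = frame.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(frame[i:i + kh, j:j + kw] * kernel)
    return out

def conv3d_valid(clip, kernel):
    """Valid 3D cross-correlation of a (T, H, W) clip with a (kt, kh, kw) kernel."""
    kt, kh, kw = kernel.shape
    T, H, W = clip.shape
    out = np.empty((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[t, i, j] = np.sum(clip[t:t + kt, i:i + kh, j:j + kw] * kernel)
    return out

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_last_hidden(features, W, U, b):
    """Run one LSTM layer over (T, D) features and return the final hidden state."""
    hidden = U.shape[1]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in features:
        z = W @ x + U @ h + b          # stacked pre-activations, shape (4 * hidden,)
        i, f, g, o = np.split(z, 4)    # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

rng = np.random.default_rng(0)
clip = rng.standard_normal((8, 16, 16))   # hypothetical clip: 8 grayscale 16x16 frames

# Route 1 (CNN + LSTM): spatial features per frame, then recurrence over time.
k2d = rng.standard_normal((3, 3))
frame_features = np.stack(
    [np.maximum(conv2d_valid(f, k2d), 0).ravel() for f in clip]  # ReLU, then flatten
)                                          # shape (8, 196)
hidden = 4                                 # assumed hidden size
W = rng.standard_normal((4 * hidden, frame_features.shape[1])) * 0.01
U = rng.standard_normal((4 * hidden, hidden)) * 0.01
b = np.zeros(4 * hidden)
clip_embedding = lstm_last_hidden(frame_features, W, U, b)   # shape (4,)

# Route 2 (3D CNN): one filter covers 3 consecutive frames and a 3x3 patch.
k3d = rng.standard_normal((3, 3, 3))
feature_volume = np.maximum(conv3d_valid(clip, k3d), 0)      # shape (6, 14, 14)
```

In a real model either representation would feed a classifier over the activity labels, and the filters and LSTM weights would be learned rather than drawn at random. Note that a 3D kernel with temporal depth 1 reduces to the per-frame 2D case, which is one way to see that the 3D CNN mixes temporal information inside the convolution while the CNN + LSTM defers it to the recurrent layer.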
