Abstract

During the past decades, the recognition of human activities has attracted the attention of numerous researchers owing to its applications, including smart houses, health care and the monitoring of private and public places. This paper proposes a hybrid method, applied to video frames, that combines features extracted from the images using the ‘scale-invariant features transform’ (SIFT), ‘histogram of oriented gradient’ (HOG) and ‘global invariant features transform’ (GIST) descriptors and classifies the activities by means of a deep belief network (DBN). First, to avoid ineffective features, a pre-processing step is applied to every image in the dataset. Then, the three descriptors extract several features from each image. Because working with a large number of features is problematic, a small and discriminative feature set is produced using the bag of words (BoW) technique. Finally, these reduced features are given to a deep belief network to recognize the human activities. The proposed approach is compared in simulation with several existing methods on the standard PASCAL VOC Challenge 2010 database, which covers nine different activities. The comparison shows an improvement in accuracy, precision and recall (reaching 96.39%, 85.77% and 86.72%, respectively) over the compared methods.
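The feature-reduction step described above can be illustrated with a minimal sketch of a bag of words encoding. The abstract does not state how the codebook is built, so plain k-means is an assumption here. The function names `build_codebook` and `bow_histogram` are hypothetical, and random vectors stand in for the SIFT, HOG and GIST descriptors, which are not computed in this sketch.

```python
import numpy as np


def build_codebook(descriptors, k, n_iter=20, seed=0):
    """Cluster local descriptors with plain k-means (an assumed choice)
    and return the k cluster centers, i.e. the 'visual words'."""
    rng = np.random.default_rng(seed)
    # Initialise the centers with k distinct descriptors picked at random.
    idx = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[idx].astype(float)
    for _ in range(n_iter):
        # Squared Euclidean distance from every descriptor to every center.
        dist = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return centers


def bow_histogram(descriptors, codebook):
    """Encode one image as a normalised histogram of visual-word counts."""
    dist = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = dist.argmin(axis=1)  # nearest visual word for each descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    total = hist.sum()
    return hist / total if total else hist


# Hypothetical stand-in data: 200 training descriptors of dimension 16.
rng = np.random.default_rng(1)
train_descriptors = rng.normal(size=(200, 16))
codebook = build_codebook(train_descriptors, k=8)

# One image with 50 descriptors becomes a single 8-dimensional vector.
image_descriptors = rng.normal(size=(50, 16))
feature_vector = bow_histogram(image_descriptors, codebook)
```

Whatever the number of descriptors an image yields, its encoding has the fixed length `k`, which is what makes it suitable as input to a classifier such as the deep belief network mentioned above.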

Highlights

  • As the applications of supervisory and security systems diversify, the need for smart algorithms that can detect the activities and behaviors of people intensifies

  • This work proposes a novel approach to human activity recognition (HAR)

  • Multiple robust features, including ‘scale-invariant features transform’ (SIFT), ‘histogram of oriented gradient’ (HOG) and ‘global invariant features transform’ (GIST), were extracted, followed by the bag of words (BoW) technique for feature reduction


Summary

Introduction

As the applications of supervisory and security systems diversify, the need for smart algorithms that can detect the activities and behaviors of people intensifies. Progress in data collection and analysis technologies has led to wide usage of human activity recognition (HAR) systems in daily life. Applications such as security and surveillance, crowd management, content-based image retrieval, action retrieval in images, user interface design, human-computer interaction, robot learning, sport image analysis and eHealth have drawn researchers to propose various methods for recognizing human activities [1, 2]. Despite the great progress in pattern classification, human activity recognition in static images is still considered a big challenge. In this regard, several issues make human activity recognition in static images difficult: images with different and complicated backgrounds, high volume of data, images taken from different views, low intra-class similarity (different people performing the same action in different ways), low inter-class variability (e.g., the similarity between drinking and eating) and lack of temporal information [1]. Human activity recognition in static images includes four steps, as follows: (1) pre-processing: applying a set of operations to the images with the aim of image enhancement and reduction of noise and redundancy; (2) feature extraction: computing and finding effective and distinctive features; (3) feature reduction: decreasing the number of features by keeping or producing the most

