Abstract

Human action recognition (HAR) has gained much attention in the last few years due to its enormous applications including human activity monitoring, robotics, visual surveillance, to name but a few. Most of the previously proposed HAR systems have focused on using hand-crafted images features. However, these features cover limited aspects of the problem and show performance degradation on a large and complex datasets. Therefore, in this work, we propose a novel HAR system which is based on the fusion of conventional hand-crafted features using histogram of oriented gradients (HoG) and deep features. Initially, human silhouette is extracted with the help of saliency-based method - implemented in two phases. In the first phase, motion and geometric features are extracted from the selected channel, whilst, second phase calculates the Chi-square distance between the extracted and threshold-based minimum distance features. Afterwards, extracted deep CNN and hand-crafted features are fused to generate a resultant vector. Moreover, to cope with the curse of dimensionality, an entropy-based feature selection technique is also proposed to identify the most discriminant features for classification using multi-class support vector machine (M-SVM). All the simulations are performed on five publicly available benchmark datasets including Weizmann, UCF11 (YouTube), UCF Sports, IXMAS, and UT-Interaction. A comparative evaluation is also presented to show that our proposed model achieves superior performances in comparison to a few exiting methods.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call