Abstract
Human action recognition remains a challenging topic in computer vision and has attracted a large number of researchers. It is important in a variety of applications, such as intelligent video surveillance, sports analysis, and human–computer interaction. Recent works attempt to exploit progress in deep learning architectures to learn spatial and temporal features from action videos. However, it remains unclear how to combine spatial and temporal information within a convolutional neural network. In this paper, we propose a novel human action recognition method that fuses spatial and temporal features learned by a simple unsupervised convolutional neural network, the principal component analysis network (PCANet), in combination with bag-of-features (BoF) and vector of locally aggregated descriptors (VLAD) encoding schemes. First, both spatial and temporal features are learned via PCANet, using a subset of frames and a set of temporal templates for each video, and their dimensionality is reduced with a whitening transformation (WT). The temporal templates are computed as short-time motion energy images (ST-MEI) based on frame differencing. Then, the encoding scheme is applied to produce the final dual spatiotemporal PCANet representation through feature fusion. Finally, a support vector machine (SVM) classifier is used for action recognition. Extensive experiments were performed on two popular datasets, KTH and UCF Sports, to evaluate the performance of the proposed method. Our experimental results, obtained with a leave-one-out evaluation strategy, demonstrate that the proposed method achieves satisfactory and comparable results on both datasets.
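To make the temporal-template step concrete, the following is a minimal sketch of one common reading of a motion energy image built by frame differencing: the union, over a short window, of thresholded absolute differences between consecutive frames. The abstract does not give the exact ST-MEI definition, so the function names `short_time_mei` and `sliding_templates`, the threshold of 25, and the window and step sizes are all illustrative assumptions rather than the authors' settings. Plain Python lists are used only to keep the sketch self-contained; an array library would be the natural choice in practice.

```python
def short_time_mei(frames, threshold=25):
    """Union of thresholded absolute differences between consecutive frames.

    frames: list of equally sized grayscale frames, each a list of rows of ints.
    threshold: hypothetical intensity cutoff (an assumption, not from the paper).
    Returns a binary template as a list of rows of 0/1 values.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames to difference")
    height, width = len(frames[0]), len(frames[0][0])
    template = [[0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                # A pixel is marked as moving if it changed by more than the cutoff.
                if abs(curr[y][x] - prev[y][x]) > threshold:
                    template[y][x] = 1
    return template


def sliding_templates(frames, window=5, step=5, threshold=25):
    """One template per short window of `window` frames (assumed sizes)."""
    return [
        short_time_mei(frames[start:start + window], threshold)
        for start in range(0, len(frames) - window + 1, step)
    ]


# Toy clip: a single bright pixel moving one column to the right.
clip = [
    [[0, 0, 0], [0, 0, 0]],
    [[200, 0, 0], [0, 0, 0]],
    [[0, 200, 0], [0, 0, 0]],
]
whole_clip_template = short_time_mei(clip)
per_window_templates = sliding_templates(clip, window=2, step=1)
```

In a pipeline of the kind described above, templates like these would be the temporal input to PCANet, alongside the sampled raw frames used for the spatial stream.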