Abstract
In recent years, most work on human action recognition has used dense trajectory features to achieve state-of-the-art results. Histograms of Oriented Gradients (HOG), Histograms of Optical Flow (HOF), and Motion Boundary Histograms (MBH) are extracted from regions that are tracked across frames. The goal of this paper is to improve the performance obtained with Improved Dense Trajectories (IDTs) by adding new features based on temporal templates. We construct these templates by treating a video sequence as a third-order tensor and computing three different projections. We use several functions to project the fibers of the video sequence and combine them by sum pooling. As the first contribution of our work, we present the tensor-projection method in detail. We first assess the results obtained using template-based action recognition alone. Next, to achieve state-of-the-art recognition rates, we fuse our features with those of IDTs; this is the second contribution of the article. Experiments on four public datasets show that this technique improves IDT performance and that the results outperform those obtained by most state-of-the-art techniques for action recognition.
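As a rough illustration of the tensor-projection idea, the sketch below treats a grayscale video as a T x H x W nested list, collapses the fibers along each of the three modes with a set of projection functions, and sums the resulting templates element-wise. The helper names (`project_fibers`, `sum_pool`, `temporal_templates`) are hypothetical, and `max` and `min` are placeholder projection functions chosen only for the example; the abstract does not say which functions the paper uses. Plain Python lists keep the sketch dependency-free, though an array library would be the natural choice in practice.

```python
def project_fibers(video, axis, fn):
    """Collapse a T x H x W nested list along `axis`, applying `fn` to each fiber."""
    n_t, n_h, n_w = len(video), len(video[0]), len(video[0][0])
    if axis == 0:  # fibers along time -> H x W template
        return [[fn([video[t][y][x] for t in range(n_t)]) for x in range(n_w)]
                for y in range(n_h)]
    if axis == 1:  # fibers along the vertical image axis -> T x W template
        return [[fn([video[t][y][x] for y in range(n_h)]) for x in range(n_w)]
                for t in range(n_t)]
    if axis == 2:  # fibers along the horizontal image axis -> T x H template
        return [[fn([video[t][y][x] for x in range(n_w)]) for y in range(n_h)]
                for t in range(n_t)]
    raise ValueError("axis must be 0, 1, or 2")


def sum_pool(templates):
    """Element-wise sum of equally sized 2-D templates."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*templates)]


def temporal_templates(video, fns):
    """One sum-pooled template per tensor mode (assumed pooling across functions)."""
    return [sum_pool([project_fibers(video, axis, fn) for fn in fns])
            for axis in range(3)]


# Tiny example: 2 frames of 2 x 3 pixels, projected with max and min (placeholders).
video = [[[1, 2, 3], [4, 5, 6]],
         [[7, 8, 9], [10, 11, 12]]]
templates = temporal_templates(video, [max, min])
```

Each of the three templates is a 2-D image that could then be described with a standard image descriptor; how the paper encodes the templates and fuses them with IDT features is not specified in the abstract.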