Abstract

Recent work has argued that correlation filters are attractive for human action recognition from video. The motivation for using them in this classification task lies in their ability to: (i) specify where the filter should peak, in contrast to all other shifts in space and time, (ii) tolerate some degree of noise and intra-class variation (allowing learning from multiple examples), and (iii) be computed deterministically with low computational overhead. Specifically, Maximum Average Correlation Height (MACH) filters have exhibited encouraging results~\cite{Mikel} on a variety of human action datasets. Here, we challenge the utility of correlation filters, such as the MACH filter, in these circumstances. First, we demonstrate empirically that performance identical to that of the MACH filter can be attained by simply taking the~\emph{average} of the same action-specific training examples. Second, we characterize, theoretically and empirically, the circumstances under which a MACH filter becomes equivalent to the average of the action-specific training examples. Based on this characterization, we offer an alternative type of filter, based on a discriminative paradigm, that circumvents the inherent limitations of correlation filters for action recognition, and we demonstrate improved action recognition performance.
