Abstract

This paper proposes a new framework for modeling the temporal structure of complex human actions. Inspired by the fact that a complex action is a temporally ordered composition of sub-actions, we develop a new model named Bag-of-Sequencelets (BoS). To construct a BoS model, a video is represented as a sequence of Primitive Actions (PAs). A PA is a representative motion pattern that constitutes actions and is learned in an unsupervised manner. Representing a video as a sequence of PAs preserves their temporal order. A sequencelet is an informative sub-sequence that describes the partial structure of an action while preserving the temporal relations among PAs. In a BoS model, an action is modeled as an ensemble of sequencelets. Sequencelets are learned automatically with sequential pattern mining, without any annotation or prior knowledge of action structure. Because the BoS model has both compositional and chronological properties, it can effectively model the structures of complex actions despite intra-class variations such as viewpoint change. Experimental results show the effectiveness of the BoS model in temporal structure modeling. Applied to the Olympic Sports and UCF YouTube datasets, BoS achieves greater classification accuracy than state-of-the-art methods.
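
To make the pipeline concrete, the following is a minimal Python sketch of the idea, not the method as implemented in the paper. It assumes that each video has already been converted to a sequence of PA labels (here hand-written strings standing in for labels learned without supervision), that a sequencelet may skip intermediate PAs as long as order is preserved, and that "informative" can be approximated by a plain support threshold. The function names `mine_sequencelets` and `bos_vector` and the parameters `min_support` and `max_len` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def subsequences(seq, max_len):
    """Return all order-preserving sub-sequences of seq with 2..max_len items."""
    found = set()
    for n in range(2, max_len + 1):
        for idx in combinations(range(len(seq)), n):
            found.add(tuple(seq[i] for i in idx))
    return found


def mine_sequencelets(videos, min_support, max_len=3):
    """Keep sub-sequences that occur in at least min_support videos.

    Assumption: frequency alone stands in for informativeness here.
    """
    support = Counter()
    for seq in videos:
        # A set is passed in, so each video counts once per pattern.
        support.update(subsequences(seq, max_len))
    return sorted(p for p, count in support.items() if count >= min_support)


def contains(seq, pattern):
    """True if pattern appears in seq in order, gaps allowed."""
    it = iter(seq)
    return all(any(x == p for x in it) for p in pattern)


def bos_vector(seq, sequencelets):
    """Encode a video as a binary vector: one entry per sequencelet."""
    return [int(contains(seq, p)) for p in sequencelets]


# Hypothetical PA sequences for three videos of the same action class.
videos = [
    ["run", "jump", "land"],
    ["run", "plant", "jump", "land"],
    ["walk", "run", "jump"],
]
sequencelets = mine_sequencelets(videos, min_support=2)
features = [bos_vector(v, sequencelets) for v in videos]
```

Each resulting vector records which partial temporal structures a video contains, so it could be passed to any standard classifier. A video that inserts an extra PA (such as `"plant"` above) still matches `("run", "jump", "land")`, which illustrates how an ensemble of sub-sequences can tolerate intra-class variation.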

