Abstract

The spatio-temporal (ST) positional relationships between local features play an important role in action recognition. To exploit this information, neighborhood-based features are built to describe the local ST structure around ST interest points. However, traditional neighborhood-construction methods, such as the sub-ST volumetric method and the nearest-neighbor-based method, ignore the orientation of the neighborhood. To make neighborhood-based features more discriminative, we construct a novel oriented neighborhood by imposing weights on the distance components. Specifically, our scheme first extracts local features and encodes them with locality-constrained linear coding (LLC). Oriented neighborhoods are then constructed by weighting the distance components between features, yielding single-scale oriented neighborhood features (SONFs). Next, multi-scale oriented neighborhood features (MONFs) are formed by concatenating SONFs across scales. As a result, each action video sequence is represented as a collection of MONFs. Finally, locality-constrained group sparse representation (LGSR) is used as the classifier on the MONFs. Experimental results on the KTH and UCF Sports datasets show that our method outperforms competing local ST feature-based human action recognition methods.
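The core idea of an oriented neighborhood can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: the function `oriented_neighbors`, the per-component weight vector, and the toy points are all assumptions introduced here. It shows how weighting the spatial and temporal distance components changes which interest points are selected as neighbors, which is the orientation effect the abstract describes.

```python
import numpy as np

def oriented_neighbors(points, center_idx, k, weights):
    """Return indices of the k nearest neighbors of points[center_idx]
    under a weighted (oriented) spatio-temporal distance.

    points  : (N, 3) array of (x, y, t) interest-point positions
    weights : (3,) per-component weights (w_x, w_y, w_t); unequal
              weights stretch or shrink the neighborhood along an axis
    """
    diff = points - points[center_idx]         # per-component displacements
    d2 = (weights * diff ** 2).sum(axis=1)     # weighted squared distances
    order = np.argsort(d2)
    return order[1:k + 1]                      # drop the center point itself

# Toy example: three points spread in x, one point far away in t.
pts = np.array([[0.0, 0.0, 0.0],
                [1.0, 0.0, 0.0],
                [0.0, 0.0, 5.0],
                [2.0, 0.0, 0.0]])

# Isotropic weights vs. weights that discount the temporal component.
iso = oriented_neighbors(pts, 0, k=2, weights=np.array([1.0, 1.0, 1.0]))
tmp = oriented_neighbors(pts, 0, k=2, weights=np.array([1.0, 1.0, 0.01]))
print(sorted(iso.tolist()), sorted(tmp.tolist()))  # → [1, 3] [1, 2]
```

With isotropic weights the neighborhood is the usual nearest-neighbor set; discounting the temporal distance pulls the temporally distant point into the neighborhood, so the same k yields a differently oriented local structure.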
