Abstract

Densely sampled local features with bag-of-words models have been widely applied to action recognition. Conventional approaches assume that different kinds of local features are totally uncorrelated, and they are separately processed, encoded, and then fused at video-level representation. However, these local features are not totally uncorrelated in practice. To address this problem, multi-view local feature fusion is exploited for local descriptor fusion in action recognition. Specifically, tensor canonical correlation analysis (TCCA) is employed to obtain a fused local feature that carries the high-order correlation hidden among different types of local features. The high-order correlation local feature improves the conventional concatenation based fusion approach. Experimental results on three challenging action recognition datasets validate the effectiveness of the proposed approach.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.