Abstract

Action recognition plays a fundamental role in computer vision and has drawn growing attention recently. This paper addresses action recognition in extreme Low Resolution (eLR) video. eLR video is often susceptible to noise, so extracting a robust representation is highly challenging. Moreover, because of the limited resolution, eLR video cannot be randomly cropped or resized, which complicates both the design and the training of a deep network for eLR video. This paper proposes a novel network for robust video representation that employs pseudo tensor low rank regularization. A new Video Low Rank Representation model (named VLRR) is first proposed to recover the inherent robust component of a given video; the recovered term is then introduced into a convolutional Network (denoted pLRN) as auxiliary pseudo Low Rank guidance. Benefiting from this auxiliary guidance, pLRN can learn an approximate low rank term end-to-end. In addition, this paper presents a new initialization strategy for an eLR recognition neTwork based on Tensor factorization (dubbed TenneT). TenneT is data-driven and learns the convolutional kernels entirely from the video distribution, without any back-propagation. It outperforms random initialization in both speed and accuracy. Experiments on benchmark datasets demonstrate the effectiveness and superiority of the proposed method.
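
The general idea of recovering a robust low rank component from a noisy video can be illustrated with a minimal sketch. This is not the VLRR model, whose formulation is not given in the abstract; it substitutes a plain truncated SVD of the frame matrix as a stand-in. The function name `low_rank_video`, the clip size, the rank, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def low_rank_video(video, rank):
    """Return a rank-`rank` approximation of a (T, H, W) video.

    Illustrative stand-in only: each frame is flattened into a row, so the
    video becomes a T x (H*W) matrix, and a truncated SVD keeps the `rank`
    strongest components.
    """
    t, h, w = video.shape
    mat = video.reshape(t, h * w)
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    return approx.reshape(t, h, w)

rng = np.random.default_rng(0)

# Hypothetical 16-frame, 12x16 clip whose clean content has rank 2,
# corrupted by additive Gaussian noise.
clean = (rng.standard_normal((16, 2)) @ rng.standard_normal((2, 12 * 16)))
clean = clean.reshape(16, 12, 16)
noisy = clean + 0.1 * rng.standard_normal(clean.shape)

recovered = low_rank_video(noisy, rank=2)

# Distance to the clean clip before and after the low rank projection.
err_noisy = np.linalg.norm(noisy - clean)
err_recovered = np.linalg.norm(recovered - clean)
print(err_noisy, err_recovered)
```

In this synthetic setting the projection discards most of the noise because the clean signal lies in a two-dimensional subspace. Real eLR video is only approximately low rank, which is presumably why the abstract describes learning an approximate low rank term under auxiliary guidance rather than applying a fixed projection.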
