Abstract

Semi-supervised video anomaly detection assumes that only normal video clips are available for training. The intuitive approach is therefore either to learn a dictionary by sparse coding or to train encoder-decoder neural networks by minimizing reconstruction error. For the former, optimizing the sparse coefficients is extremely time-consuming. For the latter, there is no guarantee that abnormal data yields a larger reconstruction error, because neural networks generalize strongly. To remedy these weaknesses while retaining the strengths of both approaches, we propose a Fast Sparse Coding Network (FSCN) based on high-level features. First, we propose a two-stream neural network that extracts Spatial-Temporal Fusion Features (STFF) from its hidden layers. With the STFF at hand, we use the Fast Sparse Coding Network to build a dictionary of normal patterns. By using a predictor to produce approximate sparse coefficients, FSCN generates sparse coefficients within a forward pass, which is simple and computationally efficient. Compared with traditional sparse-coding-based methods, FSCN is hundreds or even thousands of times faster at the test stage. Extensive experiments on benchmark datasets demonstrate that our method reaches state-of-the-art performance. Code will be released at https://github.com/Roc-Ng/FSCN_AnomalyDetection.
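To make the speed argument concrete, the sketch below contrasts iterative sparse coding (ISTA) with a predictor that produces approximate coefficients in one forward pass, and scores a sample by its reconstruction error against the dictionary. It is an illustration only, not the FSCN architecture: the function names, the single linear-plus-shrinkage predictor, and the parameters `W`, `lam`, and `n_iter` are all assumptions introduced here. In a learned predictor, `W` would be trained rather than derived from the dictionary.

```python
import numpy as np

def soft_threshold(x, lam):
    """Element-wise shrinkage operator used by L1-regularized sparse coding."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def ista(x, D, lam=0.1, n_iter=200):
    """Iterative sparse coding: minimizes 0.5*||x - D a||^2 + lam*||a||_1.

    Requires n_iter passes over the dictionary per sample, which is the
    costly step that a predictor is meant to avoid.
    """
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / L, lam / L)
    return a

def predict_codes(x, W, lam):
    """Hypothetical one-pass predictor: a single linear map plus shrinkage.

    W is an assumed weight matrix; a trained network would learn it.
    """
    return soft_threshold(W @ x, lam)

def objective(x, D, a, lam):
    """Sparse coding objective, used here only to check ISTA's progress."""
    return 0.5 * np.sum((x - D @ a) ** 2) + lam * np.sum(np.abs(a))

def anomaly_score(x, D, a):
    """Squared reconstruction error of x under dictionary D and codes a."""
    return float(np.sum((x - D @ a) ** 2))
```

With `W = D.T / L` and threshold `lam / L`, the predictor reproduces exactly one ISTA step from a zero initialization, which shows why a forward pass is so much cheaper than running the full iteration. A feature vector whose `anomaly_score` is large relative to scores seen on normal training data would be flagged as abnormal.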
