Abstract

Weakly supervised video anomaly detection (VAD) is usually formulated as a multiple instance learning (MIL) problem. Although current MIL-based methods achieve promising detection performance, they do not fully exploit the temporal dependencies in videos. Moreover, an anomalous video may contain multiple abnormal clips, while previous work focuses only on the most abnormal one. To address these issues, a temporal context alignment (TCA) network for video anomaly detection is proposed in this work. Its merits are three-fold: 1) a sparse continuous sampling strategy is proposed to adapt to the varying lengths of untrimmed videos; 2) a multi-scale attention module is used to model the temporal dependencies within a video; 3) a top-k loss strategy is used to enlarge the distance between the top-k normal and abnormal clips. Extensive experiments on two public datasets (ShanghaiTech and UCF-Crime) demonstrate the noticeable anomaly discriminability of the proposed network.
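The abstract does not give the exact formulation of the top-k loss; the following is a minimal sketch, assuming a hinge-style margin between the mean of the top-k clip scores from an abnormal video and the mean of the top-k clip scores from a normal video. The function name `topk_mil_loss`, the margin value, and the choice of k are illustrative assumptions, not the paper's reported settings.

```python
import torch

def topk_mil_loss(abnormal_scores, normal_scores, k=3, margin=1.0):
    """Hinge-style separation between top-k abnormal and top-k normal clip scores.

    abnormal_scores: (num_clips,) per-clip anomaly scores from an anomalous video.
    normal_scores:   (num_clips,) per-clip anomaly scores from a normal video.
    Both are assumed to lie in [0, 1], e.g. after a sigmoid scoring head.
    """
    topk_abn = torch.topk(abnormal_scores, k).values.mean()
    topk_nor = torch.topk(normal_scores, k).values.mean()
    # Encourage the top-k abnormal scores to exceed the top-k normal scores by a margin.
    return torch.clamp(margin - topk_abn + topk_nor, min=0.0)

# Example usage with random per-clip scores (stand-ins for a scoring head's output).
abn = torch.rand(32)
nor = torch.rand(32)
loss = topk_mil_loss(abn, nor, k=3)
```

Compared with a vanilla MIL ranking loss that uses only the single highest-scoring clip per video, averaging over the top-k clips lets the supervision cover videos that contain several abnormal segments, which is the motivation stated in the abstract.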
