Abstract

Understanding patterns in surveillance videos is challenging due to the rapid movement of crowds, occlusions, cluttered backgrounds, intraclass variations, and interclass similarities among normal and abnormal event classes. When an autoencoder-based normality model is trained using segments of normal events only, anomalous events may also be reconstructed well, causing them to be detected as normal. In this work, we propose a deep discriminative embedding (DDE)-based framework using residual spatiotemporal autoencoders (R-STAEs) that learn V2AnomalyVec embeddings. In place of R-STAEs, any promising deep architecture that extracts deep features from video segments of normal and abnormal events can also be used to form V2AnomalyVec. Results on four benchmark datasets demonstrate that the V2AnomalyVec-based approach performs significantly better than normality model-based approaches and other state-of-the-art approaches. Studies with a reduced number of abnormal video segments for training show that the proposed approach remains effective even with a small amount of abnormal training data.
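To make the described pipeline concrete, below is a minimal PyTorch sketch of the general idea: a residual spatiotemporal autoencoder whose bottleneck features are pooled into a fixed-length embedding, which is then fed to a discriminative head trained on both normal and abnormal segments. All layer sizes, the pooling scheme, and the two-class linear head are illustrative assumptions, not the paper's exact R-STAE or V2AnomalyVec architecture.

```python
import torch
import torch.nn as nn

class ResidualSTBlock(nn.Module):
    """Assumed residual spatiotemporal block: 3D convolutions with a skip connection."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + x)  # residual (skip) connection

class RSTAE(nn.Module):
    """Sketch of a residual spatiotemporal autoencoder (R-STAE).
    The bottleneck activations serve as the deep feature of a video segment."""
    def __init__(self, in_channels=1, feat_channels=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(in_channels, feat_channels, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            ResidualSTBlock(feat_channels),
        )
        self.decoder = nn.Sequential(
            ResidualSTBlock(feat_channels),
            nn.ConvTranspose3d(feat_channels, in_channels, kernel_size=4, stride=2, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)       # deep spatiotemporal feature
        x_hat = self.decoder(z)   # reconstruction (used to train the autoencoder)
        return z, x_hat

def v2anomaly_vec(rstae, segment):
    """Pool the bottleneck feature into a fixed-length embedding (assumed pooling).
    The name mirrors the paper's V2AnomalyVec; the pooling itself is an assumption."""
    z, _ = rstae(segment)
    return torch.mean(z, dim=(2, 3, 4))  # (batch, feat_channels)

# Discriminative head trained on embeddings of BOTH normal and abnormal segments,
# in contrast to a normality model trained on normal data only.
classifier = nn.Linear(32, 2)

# Usage with a dummy batch shaped (batch, channels, frames, height, width).
rstae = RSTAE()
segment = torch.rand(2, 1, 8, 64, 64)
emb = v2anomaly_vec(rstae, segment)
logits = classifier(emb)  # normal vs. abnormal scores
print(emb.shape, logits.shape)
```

Because the classifier sees abnormal embeddings during training, anomalies no longer have to be caught indirectly through high reconstruction error, which is the failure mode of normality models noted in the abstract.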
