Abstract
Video anomaly detection is the task of identifying events in a sequence of videos that deviate from normal patterns. The spatio-temporal dependencies and unstructured nature of video data make the task challenging. Many existing methods detect anomalies inaccurately because they fail to learn effectively from the training data and to capture dependencies between distant frames. To this end, we propose a model that uses a pre-trained vision transformer and an ensemble of deep convolutional auto-encoders to capture dependencies between distant frames. Moreover, AdaBoost training is employed to ensure the model learns every sample in the data properly. To evaluate the method, we conducted experiments on four publicly available video anomaly detection datasets, namely CUHK Avenue, ShanghaiTech, UCSD Ped1, and UCSD Ped2, achieving AUC scores of 93.4%, 78.8%, 93.5%, and 95.7%, respectively. The experimental results demonstrate the flexibility and generalizability of the proposed method, which stem from the robust features extracted by the pre-trained vision transformer and the efficient learning of data representations enabled by the AdaBoost training strategy.
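The abstract gives no implementation details, but the pipeline it describes, a frozen pre-trained vision transformer feeding an AdaBoost-trained ensemble of convolutional auto-encoders scored by reconstruction error, can be sketched as below. This is a minimal illustration and not the authors' code: the timm backbone (vit_base_patch16_224), the auto-encoder architecture, the ensemble size, and the AdaBoost.R2-style re-weighting rule are all assumptions.

```python
import timm
import torch
import torch.nn as nn


class ConvAutoEncoder(nn.Module):
    """One ensemble member: reconstructs ViT token features with 1-D convolutions."""

    def __init__(self, dim=768):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(dim, dim // 2, 3, padding=1), nn.ReLU(),
            nn.Conv1d(dim // 2, dim // 4, 3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Conv1d(dim // 4, dim // 2, 3, padding=1), nn.ReLU(),
            nn.Conv1d(dim // 2, dim, 3, padding=1),
        )

    def forward(self, x):  # x: (batch, dim, num_tokens)
        return self.decoder(self.encoder(x))


def extract_features(frames):
    """Frozen ViT as feature extractor (backbone choice is an assumption).
    Set pretrained=True to download weights; the paper uses a pre-trained ViT."""
    vit = timm.create_model("vit_base_patch16_224", pretrained=False).eval()
    with torch.no_grad():
        tokens = vit.forward_features(frames)  # (batch, 197, 768)
    return tokens.transpose(1, 2)              # (batch, 768, 197) for Conv1d


def adaboost_train(models, feats, epochs=10, lr=1e-3):
    """Train the ensemble sequentially, up-weighting poorly reconstructed samples
    (AdaBoost.R2-style scheme; the paper's exact weighting rule is not given here)."""
    n = feats.size(0)
    w = torch.full((n,), 1.0 / n)                        # per-sample weights
    alphas = []
    for model in models:
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(epochs):
            per_sample = ((model(feats) - feats) ** 2).mean(dim=(1, 2))
            loss = (w * per_sample).sum()                # weighted reconstruction loss
            opt.zero_grad()
            loss.backward()
            opt.step()
        with torch.no_grad():
            err = ((model(feats) - feats) ** 2).mean(dim=(1, 2))
            err = err / (err.max() + 1e-8)               # normalise errors to [0, 1]
            eps = (w * err).sum().clamp(1e-8, 1 - 1e-8)  # weighted ensemble error
            alpha = torch.log((1 - eps) / eps)           # confidence of this member
            w = w * torch.exp(alpha * err)               # up-weight hard samples
            w = w / w.sum()
        alphas.append(alpha.item())
    return alphas


def anomaly_score(models, alphas, feats):
    """Confidence-weighted reconstruction error; higher means more anomalous."""
    with torch.no_grad():
        errs = torch.stack(
            [((m(feats) - feats) ** 2).mean(dim=(1, 2)) for m in models]
        )                                                # (n_models, batch)
        a = torch.tensor(alphas).unsqueeze(1)
        return (a * errs).sum(0) / a.sum()


if __name__ == "__main__":
    frames = torch.randn(8, 3, 224, 224)   # stand-in for video frames
    feats = extract_features(frames)
    ensemble = [ConvAutoEncoder() for _ in range(3)]
    alphas = adaboost_train(ensemble, feats, epochs=2)
    print(anomaly_score(ensemble, alphas, feats))
```

Sequential re-weighting is what lets the ensemble "learn every sample properly": frames a member reconstructs badly carry more weight for the next member, and members with lower weighted error contribute more to the final score.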