Video Event Research Articles

Weakly supervised video anomaly detection (WS-VAD) aims to identify the snippets that contain anomalous events in long untrimmed videos, using only video-level binary labels. A typical paradigm among existing WS-VAD methods is to employ multiple modalities as inputs, e.g., RGB, optical flow, and audio, as they can provide sufficient discriminative clues that are robust to diverse, complicated real-world scenes. However, such a pipeline relies heavily on the availability of multiple modalities and is computationally expensive and storage demanding when processing long sequences, which limits its use in some applications. To address this dilemma, we propose a privileged knowledge distillation (KD) framework dedicated to the WS-VAD task, which can maintain the benefits of exploiting additional modalities while avoiding the need for multimodal data in the inference phase. We argue that the performance of the privileged KD framework mainly depends on two factors: 1) the effectiveness of the multimodal teacher network and 2) the completeness of the useful information transfer. To obtain a reliable teacher network, we propose a cross-modal interactive learning strategy and an anomaly-normal discrimination loss, which target learning task-specific cross-modal features and encourage the separability of anomalous and normal representations, respectively. Furthermore, we design both representation- and logits-level distillation loss functions, which force the unimodal student network to distill abundant privileged knowledge from the well-trained multimodal teacher network in a snippet-to-video fashion. Extensive experimental results on three public benchmarks demonstrate that the proposed privileged KD framework can train a lightweight yet effective detector for localizing anomalous events under the supervision of video-level annotations.
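As a rough illustration of what representation- and logits-level distillation could look like, the following minimal sketch combines a per-snippet feature term with snippet-level and video-level score terms. It is not the authors' formulation: the cosine distance, the squared-error terms, the top-k video score, and the names `distillation_loss`, `topk_mean`, `alpha`, `beta`, and `k` are all assumptions chosen for illustration.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity between two feature vectors (assumed feature-level term)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b + 1e-12)


def topk_mean(scores, k):
    """Video-level score taken as the mean of the k highest snippet scores (an assumption)."""
    top = sorted(scores, reverse=True)[:k]
    return sum(top) / len(top)


def distillation_loss(student_feats, teacher_feats,
                      student_scores, teacher_scores,
                      k=2, alpha=1.0, beta=1.0):
    """Hypothetical combined loss for a unimodal student and a multimodal teacher.

    student_feats / teacher_feats: one feature vector per snippet.
    student_scores / teacher_scores: one anomaly score in [0, 1] per snippet.
    """
    n = len(student_scores)

    # Representation level: pull each student snippet feature toward the teacher's.
    rep = sum(cosine_distance(s, t)
              for s, t in zip(student_feats, teacher_feats)) / n

    # Logits level, snippet scale: match per-snippet anomaly scores.
    snippet = sum((s - t) ** 2
                  for s, t in zip(student_scores, teacher_scores)) / n

    # Logits level, video scale: match the aggregated video-level scores.
    video = (topk_mean(student_scores, k) - topk_mean(teacher_scores, k)) ** 2

    return alpha * rep + beta * (snippet + video)
```

In this sketch the teacher outputs are treated as fixed targets; at inference time only the student, which sees a single modality, would be evaluated.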

Video anomaly detection (VAD) refers to the discrimination of unexpected events in videos. The deep generative model (DGM)-based method learns the regular patterns of normal videos and expects the learned model to yield larger generative errors for abnormal frames. However, a DGM cannot always do so, since it usually captures the patterns shared between normal and abnormal events, which results in similar generative errors for both. In this article, we propose a novel self-supervised framework for unsupervised VAD to tackle this problem. To this end, we design a novel self-supervised attentive generative adversarial network (SSAGAN), which is composed of a self-attentive predictor, a vanilla discriminator, and a self-supervised discriminator. On the one hand, the self-attentive predictor can capture long-term dependencies to improve the prediction quality of normal frames. On the other hand, the predicted frames are fed to the vanilla discriminator and the self-supervised discriminator to perform true-false discrimination and self-supervised rotation detection, respectively. Essentially, the role of the self-supervised task is to enable the predictor to encode semantic information into the predicted normal frames via adversarial training, so that the rotation angles of rotated normal frames can be detected. As a result, our self-supervised framework lessens the generalization ability of the model to abnormal frames, resulting in larger detection errors for abnormal frames. Extensive experimental results indicate that SSAGAN outperforms other state-of-the-art methods, which demonstrates its validity and advancement.
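The rotation-detection pretext task mentioned above can be sketched as follows: each frame is rotated by 0, 90, 180, and 270 degrees, and the rotation index serves as a free label for a classifier. This is only a sketch under assumptions; the abstract does not state the rotation set or data layout, and the helper names `rotate90` and `make_rotation_samples` are hypothetical. Frames are represented here as plain nested lists of pixel values.

```python
def rotate90(frame):
    """Rotate a 2-D frame (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*frame[::-1])]


def make_rotation_samples(frame):
    """Return (rotated_frame, label) pairs for labels 0..3.

    Label r means the frame was rotated by r * 90 degrees clockwise.
    A self-supervised discriminator would be trained to predict r.
    """
    samples = []
    current = [list(row) for row in frame]  # copy so the input is left untouched
    for label in range(4):
        samples.append((current, label))
        current = rotate90(current)
    return samples
```

A rotation classifier trained on such pairs needs no manual annotation, which is what makes the task self-supervised.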

Articles published on Video Event

A Deep Learning Framework for Monitoring Audience Engagement in Online Video Events

Deep learning based anomaly detection in real-time video

Modeling and Performance Analysis of a Notification-Based Method for Processing Video Queries on the Fly

Local feature‐based video captioning with multiple classifier and CARU‐attention

Basketball Tactics Analysis Based on Improved Openpose Algorithm and Its Application

Video Event Extraction with Multi-View Interaction Knowledge Distillation

HiFormer: Hierarchical transformer for grounded situation recognition

Background separation network for video anomaly detection

TransGANomaly: Transformer based Generative Adversarial Network for Video Anomaly Detection

An Effective Video Event Classification by Optimizing the Hyper-Parameters Using Improved Pelican Optimization and Bi-LSTM Classifier

Spatiotemporal Masked Autoencoder with Multi-Memory and Skip Connections for Video Anomaly Detection

Distilling Privileged Knowledge for Anomalous Event Detection From Weakly Labeled Videos.

Clustering Aided Weakly Supervised Training to Detect Anomalous Events in Surveillance Videos.

A Classification Method of Sports Video Events based on Hierarchical Deep Network

Automated Video Events Detection and Classification using CNN-GRU Model

Anomaly detection method based on temporal spatial information enhancement

Fine-grained Motion Enhancement for action recognition: Focusing on action-related regions

Self-Supervised Attentive Generative Adversarial Networks for Video Anomaly Detection.

Cross-media web video event mining based on multiple semantic-paths embedding

Deep Learning Based Video Event Classification
