A Self-Attention Augmented Graph Convolutional Clustering Networks for Skeleton-Based Video Anomaly Behavior Detection

Chengming Liu, Yinghao Li, Ronghua Fu, Lei Shi, Yufei Gao, Weiwei Li

doi:10.3390/app12010004

Abstract

In this paper, we propose a new method for detecting abnormal human behavior from skeleton features using self-attention augmented graph convolution. Skeleton data have been shown to be robust to complex backgrounds, illumination changes, and dynamic camera scenes, and are naturally represented as a graph in non-Euclidean space. In particular, spatial temporal graph convolutional networks (ST-GCN) can effectively learn the spatio-temporal relationships of non-Euclidean structured data. However, ST-GCN operates only on local neighborhood nodes and therefore lacks global information. We propose a novel spatial temporal self-attention augmented graph convolutional network (SAA-Graph) that combines an improved spatial graph convolution operator with a modified transformer self-attention operator to capture both local and global information about the joints. The spatial self-attention augmented module is used to model the intra-frame relationships between human body parts. To the best of our knowledge, we are the first to use self-attention for video anomaly detection by enhancing spatial temporal graph convolution. To validate the proposed model, we performed extensive experiments on two large-scale public benchmark datasets (ShanghaiTech Campus and CUHK Avenue), which show state-of-the-art performance for our approach compared with existing skeleton-based methods and graph convolution methods.
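The idea of pairing a local graph convolution with a global self-attention branch can be illustrated with a minimal NumPy sketch for a single frame. This is not the authors' implementation: the toy skeleton, the channel sizes, the random weights, and the choice to concatenate the two branches are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def normalized_adjacency(edges, num_joints):
    # Symmetric normalization D^-1/2 (A + I) D^-1/2 of the skeleton graph.
    a = np.eye(num_joints)
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(x, a_hat, w):
    # Local branch: each joint aggregates only itself and its graph neighbours.
    return a_hat @ x @ w

def self_attention(x, w_q, w_k, w_v):
    # Global branch: each joint attends to every joint in the frame.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores, axis=-1) @ v

def saa_layer(x, a_hat, params):
    # Assumption: the two branches are combined by channel concatenation.
    local = graph_conv(x, a_hat, params["w_g"])
    global_ = self_attention(x, params["w_q"], params["w_k"], params["w_v"])
    return np.concatenate([local, global_], axis=-1)

rng = np.random.default_rng(0)
num_joints, c_in, c_gcn, c_att = 5, 3, 4, 4
edges = [(0, 1), (1, 2), (1, 3), (3, 4)]  # hypothetical toy skeleton
a_hat = normalized_adjacency(edges, num_joints)
params = {
    "w_g": rng.normal(size=(c_in, c_gcn)),
    "w_q": rng.normal(size=(c_in, c_att)),
    "w_k": rng.normal(size=(c_in, c_att)),
    "w_v": rng.normal(size=(c_in, c_att)),
}
x = rng.normal(size=(num_joints, c_in))  # one frame: joints x input channels
out = saa_layer(x, a_hat, params)
print(out.shape)  # (5, 8)
```

In this sketch the graph branch can only mix information along skeleton edges, while the attention branch produces a dense joint-to-joint weighting, which is the local versus global distinction the abstract describes.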

Highlights

Video anomaly detection is a highly challenging task in unsupervised video analysis
The key contributions of this work are as follows: (1) We propose a novel spatial temporal self-attention augmented graph convolutional clustering network for skeleton-based video anomaly detection, which uses a spatial temporal self-attention augmented graph convolutional autoencoder to extract the relevant features and perform embedded clustering; (2) We design a new spatial self-attention enhanced graph convolution operator to model the intra-frame interaction between different body parts and capture the local and global features of a skeleton in the frame; (3) Our model achieves a state-of-the-art area under the ROC curve (AUC) of 0.789 on the ShanghaiTech Campus anomaly detection dataset and performs well on the CUHK Avenue dataset
We show that, compared with the graph convolution baseline, SAA-Graph can achieve a more flexible and dynamic representation between skeletons while overcoming the locality of graph convolution
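For readers unfamiliar with the AUC metric quoted above, the following sketch computes the area under the ROC curve from per-frame scores using the rank (Mann-Whitney) formulation. The labels and scores are made-up illustrative values, not data from the paper, and the convention that a higher score means "more anomalous" is an assumption (a normality score would be negated first).

```python
import numpy as np

def roc_auc(labels, scores):
    # AUC equals the probability that a randomly chosen anomalous frame
    # receives a higher score than a randomly chosen normal frame,
    # with ties counted as one half.
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]    # scores of anomalous frames
    neg = scores[~labels]   # scores of normal frames
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one anomalous and one normal frame")
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

# Hypothetical per-frame ground truth (1 = anomalous) and anomaly scores.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.2, 0.8, 0.9]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise comparison is quadratic in the number of frames, which is fine for a small illustration; a sort-based implementation would be preferable for full datasets.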

Summary

Introduction

Video anomaly detection is a highly challenging task in unsupervised video analysis. In recent years, surveillance video anomaly detection has gained widespread attention owing to its applications in public security and social security management, and to the rise of deep learning and computer vision. We use self-attention to address the locality of the graph convolution operator by capturing the global information in the skeleton data. The key contributions of this work are as follows: (1) We propose a novel spatial temporal self-attention augmented graph convolutional clustering network for skeleton-based video anomaly detection, which uses a spatial temporal self-attention augmented graph convolutional autoencoder to extract the relevant features and perform embedded clustering; (2) We design a new spatial self-attention enhanced graph convolution operator to model the intra-frame interaction between different body parts and capture the local and global features of a skeleton in the frame; (3) Our model achieves a state-of-the-art AUC of 0.789 on the ShanghaiTech Campus anomaly detection dataset and performs well on the CUHK Avenue dataset. Related work has used pose graphs and a Dirichlet process mixture for video anomaly detection, together with a new coarse-grained setting for exploring broader aspects of video anomaly detection.

Skeleton-Based Action Recognition

Transformer

Graph Convolutional Neural Networks

Proposed Method

Spatiotemporal Graph Connection Configuration for Skeleton

Spatial Graph Convolution

Deep Embedded Clustering

Normality Score

Experiment

Dataset

Implementation Details

Comparison with State-of-the-Art Methods

Method

Ablation Study

The Visualization of SAA-Graph

Findings

Conclusions

Journal: Applied Sciences	Publication Date: Dec 21, 2021
Citations: 14	License type: CC BY 4.0

Similar Papers

Spatial Residual Layer and Dense Connection Block Enhanced Spatial Temporal Graph Convolutional Network for Skeleton-Based Action Recognition
Cong Wu ... Josef Kittler
01 Oct 2019

Spatial Temporal Variation Graph Convolutional Networks (STV-GCN) for Skeleton-Based Emotional Action Recognition
Ming-Fong Tsai ... Chiung-Hung Chen
IEEE Access | VOL. 9
01 Jan 2020

Enhanced Spatial and Extended Temporal Graph Convolutional Network for Skeleton-Based Action Recognition.
Fanjia Li ... Juanjuan Li
Sensors | VOL. 20
15 Sep 2020

A spatial attentive and temporal dilated (SATD) GCN for skeleton‐based action recognition
Jiaxu Zhang ... Yongtao Qin
CAAI Transactions on Intelligence Technology | VOL. 7
17 Mar 2021
