DVAEGMM: Dual Variational Autoencoder With Gaussian Mixture Model for Anomaly Detection on Attributed Networks

Wasim Khan, Mohammad Haroon, Ahmad Neyaz Khan, Mohammad Kamrul Hasan, Asif Khan, Umi Asma Mokhtar, Shayla Islam

doi:10.1109/access.2022.3201332

Open Access

https://doi.org/10.1109/access.2022.3201332

Abstract

Attributed networks, which combine node attributes with the underlying network topology, are a significant form of today’s digital information. Anomaly detection on attributed networks has recently drawn significant attention from researchers and is widely used in a number of high-impact areas. Most current approaches rely on shallow learning methods such as community analysis, ego-network analysis, or subspace selection. These approaches struggle with network sparsity and data nonlinearity, and they fail to capture the intricate relationships between the different information sources. Deep learning approaches such as graph autoencoders perform anomaly detection by learning node embeddings while handling network nonlinearity and sparsity. However, they ignore the embedding distribution of the latent codes, which results in poor representations in many instances. In this paper, we propose a new framework called DVAEGMM to detect anomalies on attributed networks. First, our framework uses a dual variational autoencoder to capture the complex cross-modality relationships between node attributes and network structure. This is similar to vanilla autoencoders, but the framework also considers the potential data distribution and uses a generative adversarial network (GAN) for adversarial regularization. The adversarial mechanism pushes the encoder toward more accurate estimates of how the potential features are distributed. As a result, the decoders can reconstruct graphs that are closer to the original graph. The algorithm represents each input data point by a low-dimensional representation and a reconstruction probability. Lastly, a Gaussian Mixture Model, acting as a separate estimation network, approximates the density of the latent vectors, and anomalies are detected by measuring sample energy. The components are trained jointly as an end-to-end framework: DVAEGMM simultaneously optimizes the parameters of the mixture model, the generative adversarial network, and the variational autoencoder. The joint optimization balances the reconstruction probability, the density approximation of the latent representation, and regularization. Extensive experiments on attributed networks show that DVAEGMM significantly outperforms existing methods, demonstrating the effectiveness of the presented approach. The AUC scores of our proposed framework on the BlogCatalog, Flickr, Enron, and Amazon datasets are 0.89380, 0.87130, 0.72480, and 0.75102, respectively.
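
The energy-based scoring step can be illustrated with a minimal sketch. It assumes that sample energy is the negative log-likelihood of a latent vector under the fitted Gaussian mixture, a common definition for mixture-based detectors; the abstract does not give the exact formula. The function name `gmm_sample_energy`, the toy mixture parameters, and the synthetic latent vectors are all hypothetical and are not taken from the paper.

```python
import numpy as np

def gmm_sample_energy(z, weights, means, covs):
    """Negative log-likelihood of each row of z under a Gaussian mixture.

    Assumed shapes: z (n, d), weights (k,), means (k, d), covs (k, d, d).
    A higher energy means a lower density, i.e. a more anomalous sample.
    """
    n, d = z.shape
    log_terms = []
    for phi, mu, sigma in zip(weights, means, covs):
        diff = z - mu                                  # (n, d)
        solved = np.linalg.solve(sigma, diff.T).T      # Sigma^{-1} (z - mu)
        maha = np.sum(diff * solved, axis=1)           # squared Mahalanobis distance
        _, logdet = np.linalg.slogdet(sigma)           # log-determinant of Sigma
        log_norm = -0.5 * (d * np.log(2.0 * np.pi) + logdet)
        log_terms.append(np.log(phi) + log_norm - 0.5 * maha)
    log_terms = np.stack(log_terms, axis=1)            # (n, k)
    # Log-sum-exp over mixture components for numerical stability.
    m = log_terms.max(axis=1, keepdims=True)
    log_lik = m[:, 0] + np.log(np.exp(log_terms - m).sum(axis=1))
    return -log_lik

# Toy data: a hypothetical two-component mixture in a 2-D latent space.
rng = np.random.default_rng(0)
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [5.0, 5.0]])
covs = np.stack([np.eye(2), np.eye(2)])

normal = rng.normal(size=(100, 2))       # samples near the first component
outlier = np.array([[20.0, -20.0]])      # a point far from both components
energy = gmm_sample_energy(np.vstack([normal, outlier]), weights, means, covs)
```

In this sketch the far-away point receives the largest energy, so thresholding `energy` (for example at a high percentile) would flag it. In the framework described above, the mixture parameters would instead come from the jointly trained estimation network.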

Full Text