Abstract

The autoencoder (AE) is a fundamental deep learning approach to anomaly detection. AEs are trained on the assumption that abnormal inputs will produce higher reconstruction errors than normal ones. In practice, however, this assumption is unreliable in the unsupervised case, where the training data may contain anomalous examples. Given sufficient capacity and training time, an AE can generalize to such an extent that it reliably reconstructs anomalies. Consequently, the ability to distinguish anomalies via reconstruction errors is diminished. We respond to this limitation by introducing three new methods to more reliably train AEs for unsupervised anomaly detection: cumulative error scoring (CES), percentile loss (PL), and early stopping via knee detection. We demonstrate significant improvements over conventional AE training on image, remote-sensing, and cybersecurity datasets.

Highlights

  • Anomaly (outlier) detection is of critical importance across many domains, including fraud identification, video surveillance, medical applications, remote sensing, and network monitoring

  • Conventional autoencoders (AEs) used for unsupervised anomaly detection are prone to over-generalizing to anomalies present in the training data

  • This over-generalization reduces the ability of AEs to identify abnormal data based on measures of reconstruction error


Summary

INTRODUCTION

Anomaly (outlier) detection is of critical importance across many domains, including fraud identification, video surveillance, medical applications, remote sensing, and network monitoring. Semi-supervised approaches to deep anomaly detection (DAD) attempt to circumvent the need for labeled anomalous examples by using only readily available, normal examples to build a model of the data. Traditional methods of unsupervised anomaly detection generally use measurements of distances, densities, or clustering to differentiate normal and anomalous points; among these are k-means, nearest-neighbor, and Gaussian mixture models [21]. We expand upon our previous related work by demonstrating the ability of CES and PL to prevent AEs from generalizing anomalies across a number of applications, allowing greater reliability in unsupervised DAD [11]. Specifying a small number of burn-in epochs, b, ignores the initial period of training, during which reconstruction errors are not yet reliably indicative of normality.
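The intuition behind cumulative error scoring can be sketched as follows. This is a hypothetical illustration, not the authors' reference implementation: we assume per-sample reconstruction errors are recorded at every epoch, and that CES sums each sample's errors over all epochs after the burn-in period b, so that anomalies the AE eventually learns to reconstruct still accumulate high error from the epochs before the model over-generalized.

```python
import numpy as np

def cumulative_error_scores(per_epoch_errors, b=5):
    """Hypothetical sketch of cumulative error scoring (CES).

    per_epoch_errors: array of shape (epochs, n_samples) holding each
    sample's reconstruction error at each training epoch.
    b: number of burn-in epochs to discard, since early-epoch errors
    are not yet indicative of normality.

    Returns one anomaly score per sample: the sum of its reconstruction
    errors over all post-burn-in epochs.
    """
    per_epoch_errors = np.asarray(per_epoch_errors, dtype=float)
    if b >= per_epoch_errors.shape[0]:
        raise ValueError("burn-in b must be smaller than the number of epochs")
    # Discard the first b epochs, then accumulate errors across epochs.
    return per_epoch_errors[b:].sum(axis=0)
```

Under these assumptions, a sample whose error decays slowly (a candidate anomaly) receives a larger cumulative score than a normal sample whose error drops quickly, even if both end training with similar final-epoch errors.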

PERCENTILE LOSS
EXPERIMENTS
Findings
CONCLUSION