Challenges for unsupervised anomaly detection in particle physics

Katherine Fraser, Samuel Homiller, Rashmish K Mishra, Matthew D Schwartz, Bryan Ostdiek

doi:10.1007/jhep03(2022)066

Abstract

Anomaly detection relies on designing a score to determine whether a particular event is uncharacteristic of a given background distribution. One way to define a score is to use autoencoders, which rely on the ability to reconstruct certain types of data (background) but not others (signals). In this paper, we study some challenges associated with variational autoencoders, such as the dependence on hyperparameters and the metric used, in the context of anomalous signal (top and W) jets in a QCD background. We find that the hyperparameter choices strongly affect the network performance and that the optimal parameters for one signal are non-optimal for another. In exploring the networks, we uncover a connection between the latent space of a variational autoencoder trained using mean-squared-error and the optimal transport distances within the dataset. We then show that optimal transport distances to representative events in the background dataset can be used directly for anomaly detection, with performance comparable to the autoencoders. Whether using autoencoders or optimal transport distances for anomaly detection, we find that the choices that best represent the background are not necessarily best for signal identification. These challenges with unsupervised anomaly detection bolster the case for additional exploration of semi-supervised or alternative approaches.
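As a rough illustration of how a variational autoencoder can yield a per-event score, the sketch below combines a mean-squared-error reconstruction term with the standard Gaussian KL-divergence term of a VAE. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the relative weight `beta`, and the choice to use the full loss as the score are all hypothetical, and the encoder and decoder outputs are taken as given rather than produced by a trained network.

```python
import math


def mse(x, x_hat):
    """Mean squared error between one event and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def kl_to_standard_normal(mu, log_var):
    """KL divergence of a diagonal Gaussian N(mu, exp(log_var)) from N(0, 1),
    summed over latent dimensions."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def vae_anomaly_score(x, x_hat, mu, log_var, beta=1.0):
    """Assumed score: reconstruction error plus beta times the KL term.

    `beta` is a hypothetical name for the relative weight of the two terms;
    it stands in for the kind of hyperparameter whose choice affects
    performance.
    """
    return mse(x, x_hat) + beta * kl_to_standard_normal(mu, log_var)


# Toy inputs standing in for encoder/decoder outputs of a trained network.
well_reconstructed = vae_anomaly_score([1.0, 1.0], [1.0, 1.0], [0.0], [0.0])
poorly_reconstructed = vae_anomaly_score([1.0, 1.0], [0.0, 0.0], [0.0], [0.0])
print(well_reconstructed, poorly_reconstructed)
```

In this toy setup a poorly reconstructed event receives a higher score than a well reconstructed one; changing `beta` or swapping the reconstruction metric changes the ranking of events, which is the kind of sensitivity the abstract describes.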

Highlights

An anomaly score for individual events can be determined from the background ensemble and used for discrimination, without needing to characterize the full probability distribution of the signal ensemble.
We study some challenges associated with variational autoencoders, such as the dependence on hyperparameters and the metric used, in the context of anomalous signal jets in a QCD background.
Since we can think of the variational autoencoder (VAE) anomaly score as a “distance” encoding how far any given event is from the background distribution, it is natural to ask about the distances between individual events.

Summary

Introduction

An anomaly score for individual events can be determined from the background ensemble and used for discrimination, without needing to characterize the full probability distribution of the signal ensemble. The latent space of the VAE encodes the probability distribution of the background training sample, which can be used in the anomaly score. The task of an autoencoder, variational or not, in unsupervised anomaly detection is to provide a strong, universal signal/background discriminant for a variety of signals while having access only to background for training. In principle, this approach is advantageous because it opens the possibility of bypassing Monte Carlo simulations and working directly with experimental data, which is almost completely background. Since we can think of the VAE anomaly score as a “distance” encoding how far any given event is from the background distribution, it is natural to ask about the distances between individual events. We study Wasserstein distances in particular because they were physically motivated in refs. [56–58].
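The idea of scoring an event by its distance to background events can be sketched with a one-dimensional Wasserstein-1 distance. This is a simplified illustration, not the paper's procedure: events are reduced to weighted points along a single coordinate, each event is normalized to unit total weight, and the score is taken to be the minimum distance to a small set of reference events. The function names, the normalization, and the use of the minimum are all assumptions made for this sketch.

```python
def wasserstein_1d(positions_a, weights_a, positions_b, weights_b):
    """Wasserstein-1 distance between two weighted point sets on a line.

    Each set is normalized to unit total weight, and the distance is the
    integral of the absolute difference between the two cumulative
    distributions.
    """
    total_a = sum(weights_a)
    total_b = sum(weights_b)
    # Signed masses: positive for event a, negative for event b.
    points = [(x, w / total_a) for x, w in zip(positions_a, weights_a)]
    points += [(x, -w / total_b) for x, w in zip(positions_b, weights_b)]
    points.sort(key=lambda p: p[0])

    distance = 0.0
    cdf_gap = 0.0  # running difference between the two cumulative distributions
    for (x, mass), (x_next, _) in zip(points, points[1:]):
        cdf_gap += mass
        distance += abs(cdf_gap) * (x_next - x)
    return distance


def ot_anomaly_score(event, reference_events):
    """Assumed score: smallest distance from `event` to any reference event."""
    positions, weights = event
    return min(
        wasserstein_1d(positions, weights, ref_positions, ref_weights)
        for ref_positions, ref_weights in reference_events
    )


# Toy events given as (positions, weights); values are purely illustrative.
references = [([0.0], [1.0]), ([1.0], [1.0])]
candidate = ([3.0], [2.0])
print(ot_anomaly_score(candidate, references))
```

Events that sit far from every reference event in this metric receive a large score. A realistic version would use a higher-dimensional ground space and a general optimal transport solver rather than the one-dimensional shortcut used here.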

Journal: Journal of High Energy Physics
Publication Date: Mar 1, 2022
Citations: 23
License type: open-access
