Abstract
Anomaly detection relies on designing a score to determine whether a particular event is uncharacteristic of a given background distribution. One way to define a score is to use autoencoders, which rely on the ability to reconstruct certain types of data (background) but not others (signals). In this paper, we study some challenges associated with variational autoencoders, such as the dependence on hyperparameters and the metric used, in the context of anomalous signal (top and W) jets in a QCD background. We find that the hyperparameter choices strongly affect the network performance and that the optimal parameters for one signal are non-optimal for another. In exploring the networks, we uncover a connection between the latent space of a variational autoencoder trained using mean-squared-error and the optimal transport distances within the dataset. We then show that optimal transport distances to representative events in the background dataset can be used directly for anomaly detection, with performance comparable to the autoencoders. Whether using autoencoders or optimal transport distances for anomaly detection, we find that the choices that best represent the background are not necessarily best for signal identification. These challenges with unsupervised anomaly detection bolster the case for additional exploration of semi-supervised or alternative approaches.
Highlights
An anomaly score for individual events can be determined from the background ensemble and used for discrimination, without needing to characterize the full probability distribution of the signal ensemble
We study some challenges associated with variational autoencoders, such as the dependence on hyperparameters and the metric used, in the context of anomalous signal jets in a QCD background
Since we can think of the variational autoencoder (VAE) anomaly score as a “distance” encoding how far any given event is from the background distribution, it is natural to ask about the distances between individual events
Summary
An anomaly score for individual events can be determined from the background ensemble and used for discrimination, without needing to characterize the full probability distribution of the signal ensemble. The latent space of the VAE encodes the probability distribution of the background training sample, which can be used in the anomaly score. The task of an autoencoder, variational or not, in unsupervised anomaly detection is to provide a strong, universal signal/background discriminant for a variety of signals while having access only to background for training. In principle, this approach is advantageous because it opens the possibility of bypassing Monte Carlo simulations and working directly with experimental data, which is almost completely background. Since we can think of the VAE anomaly score as a “distance” encoding how far any given event is from the background distribution, it is natural to ask about the distances between individual events. We study Wasserstein distances in particular because they were physically motivated in refs. [56–58].
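To make the idea of a distance-based anomaly score concrete, the following is a minimal sketch rather than the method used in the paper. It assumes a deliberately simplified setting: each event is a set of particles along a single angular coordinate, weighted by momentum fraction, so that the Wasserstein-1 distance has a closed form as the area between the two cumulative distributions. Real jets live in a two-dimensional angular plane, where the optimal transport problem has to be solved numerically. The function names, the toy events, and the choice of taking the minimum distance to a handful of reference events are all assumptions made for illustration.

```python
def wasserstein_1d(xs_a, ws_a, xs_b, ws_b):
    """Wasserstein-1 distance between two weighted 1D point sets.

    Weights are normalized to unit total, so each event is treated as a
    probability distribution (an assumption of this sketch).
    """
    total_a, total_b = sum(ws_a), sum(ws_b)
    # Event A contributes positive mass, event B negative mass.
    points = [(x, w / total_a) for x, w in zip(xs_a, ws_a)]
    points += [(x, -w / total_b) for x, w in zip(xs_b, ws_b)]
    points.sort()

    distance, cdf_gap = 0.0, 0.0
    # Between consecutive positions the difference of the two CDFs is
    # constant, so the integral of |F_a - F_b| is a sum of rectangles.
    for (x, mass), (x_next, _) in zip(points, points[1:]):
        cdf_gap += mass
        distance += abs(cdf_gap) * (x_next - x)
    return distance


def anomaly_score(event, references):
    """Smallest distance from `event` to any reference background event.

    Taking the minimum is one possible choice, assumed here for illustration.
    """
    xs, ws = event
    return min(wasserstein_1d(xs, ws, rx, rw) for rx, rw in references)


# Hypothetical reference events: one-prong, background-like toy jets.
references = [
    ([0.0], [1.0]),
    ([0.0, 0.1], [0.9, 0.1]),
]

# Toy test events: a one-prong jet and a two-prong, signal-like jet.
background_like = ([0.0, 0.05], [0.95, 0.05])
signal_like = ([-0.4, 0.4], [0.5, 0.5])

score_bkg = anomaly_score(background_like, references)
score_sig = anomaly_score(signal_like, references)
print(score_bkg, score_sig)
```

In this toy setup the two-prong event sits much farther from the reference events than the one-prong event does, so a threshold on the score separates them. Which reference events to use and how to combine the distances are choices that, as the summary above notes, need not be optimal for both background modeling and signal identification.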