Abstract

We introduce a potentially powerful new method of searching for new physics at the LHC, using autoencoders and unsupervised deep learning. The key idea of the autoencoder is that it learns to map "normal" events back to themselves, but fails to reconstruct "anomalous" events that it has never encountered before. The reconstruction error can then be used as an anomaly score, with events above a chosen threshold flagged as anomalous. We demonstrate the effectiveness of this idea using QCD jets as background and boosted top jets and RPV gluino jets as signal. We show that a deep autoencoder can significantly improve signal over background when trained on backgrounds only, or even directly on data which contains a small admixture of signal. Finally, we examine the correlation of the autoencoders with jet mass and show how the jet mass distribution can be stable against cuts in reconstruction loss. This may be important for estimating QCD backgrounds from data. As a test case, we show how one could plausibly discover 400 GeV RPV gluinos using an autoencoder combined with a bump hunt in jet mass. This opens up the exciting possibility of training directly on actual data to discover new physics with no prior expectations or theory prejudice.
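The reconstruction-error idea can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the "events" are synthetic Gaussian vectors rather than jet images, and a linear autoencoder fitted by PCA stands in for the deep autoencoders studied in the paper. The names `fit_linear_autoencoder` and `reconstruction_error` are hypothetical helpers, not part of any library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions, not the paper's jet images):
# "background" lies close to a 2-dimensional subspace of a 10-dimensional space,
# "signal" is spread isotropically over all 10 dimensions.
dim, latent_dim = 10, 2
basis = rng.normal(size=(latent_dim, dim))
background = (rng.normal(size=(5000, latent_dim)) @ basis
              + 0.05 * rng.normal(size=(5000, dim)))
signal = rng.normal(size=(500, dim))


def fit_linear_autoencoder(x, k):
    """Fit a linear autoencoder via PCA: return the mean and top-k principal directions."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:k]


def reconstruction_error(x, mean, components):
    """Per-event mean squared error of the encode/decode round trip."""
    z = (x - mean) @ components.T      # encode into the latent space
    x_hat = z @ components + mean      # decode back to the input space
    return np.mean((x - x_hat) ** 2, axis=1)


# Train on background only, then score both samples.
mean, components = fit_linear_autoencoder(background, latent_dim)
bkg_err = reconstruction_error(background, mean, components)
sig_err = reconstruction_error(signal, mean, components)

# Place the cut so that roughly 1% of background events pass it.
threshold = np.quantile(bkg_err, 0.99)
bkg_eff = np.mean(bkg_err > threshold)
sig_eff = np.mean(sig_err > threshold)
print(f"background efficiency: {bkg_eff:.3f}, signal efficiency: {sig_eff:.3f}")
```

In this toy setup the separation is far cleaner than anything one should expect for real jets; the point is only the workflow of fitting on "normal" events, scoring by reconstruction error, and cutting on that score.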

Highlights

  • Deep learning is a hot topic in high energy physics

  • Applications of deep learning in high energy physics do not stop at classification tasks; pileup removal [34], generative models [35], and many others have all been studied

  • We show how, using a convolutional neural network (CNN) autoencoder, a bump hunt in jet mass could potentially reveal the presence of 400 GeV R-parity violating (RPV) gluinos in actual data


Summary

INTRODUCTION

Deep learning is a hot topic in high energy physics. It has been applied to tagging boosted jets of various kinds [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15], to quark/gluon discrimination [16,17,18], and to full event classification [19,20,21]. When trained on data containing a small admixture of signal, the autoencoder learns to preferentially reconstruct the background and still reconstructs the signal poorly, even though it sees the signal during training. This raises the exciting possibility that the autoencoder could be trained directly on the data and could potentially discover any anomalous signal of new physics in the background (perhaps when combined with other variables, for instance a mass cut or bump hunt), provided the signal looks different enough from Standard Model (SM) objects. This would be an ideal method to discover the unexpected or to perform open-ended searches for new physics at the LHC.
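The effect of training on contaminated data can be explored with a toy sketch under loudly stated assumptions: the events are synthetic Gaussian vectors, not jets, and a linear autoencoder fitted by PCA replaces the deep autoencoder. The function `signal_to_background_error_ratio` and its `contamination` parameter are hypothetical names introduced here. A linear model may well be more robust to contamination than a deep network, so this says nothing quantitative about the paper's results.

```python
import numpy as np


def signal_to_background_error_ratio(contamination, seed=0):
    """Train a PCA 'autoencoder' on background plus a signal admixture.

    Returns mean signal reconstruction error divided by mean background
    reconstruction error, both measured on fresh test samples.
    """
    rng = np.random.default_rng(seed)
    dim, latent_dim, n_train = 10, 2, 5000
    basis = rng.normal(size=(latent_dim, dim))

    def sample_background(n):
        # Toy background: close to a 2-dimensional subspace, plus small noise.
        return rng.normal(size=(n, latent_dim)) @ basis + 0.05 * rng.normal(size=(n, dim))

    def sample_signal(n):
        # Toy signal: isotropic in all dimensions.
        return rng.normal(size=(n, dim))

    # Mix a fraction `contamination` of signal into the training sample.
    n_sig = int(contamination * n_train)
    train = np.vstack([sample_background(n_train - n_sig), sample_signal(n_sig)])

    # Fit the linear autoencoder (PCA) on the contaminated sample.
    mean = train.mean(axis=0)
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    components = vt[:latent_dim]

    def error(x):
        x_hat = (x - mean) @ components.T @ components + mean
        return np.mean((x - x_hat) ** 2, axis=1)

    return error(sample_signal(1000)).mean() / error(sample_background(1000)).mean()


ratios = {f: signal_to_background_error_ratio(f) for f in (0.0, 0.01, 0.1)}
for fraction, ratio in ratios.items():
    print(f"signal fraction {fraction:.2f}: error ratio {ratio:.1f}")
```

A ratio well above one at nonzero contamination means the toy model still reconstructs signal much worse than background despite having seen signal during training, which is the qualitative behavior described above.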

METHODS
Sample generation
Autoencoder architectures
TRAINING ON BACKGROUNDS
Choosing the latent dimension
Robustness with other Monte Carlo
Contamination study
Correlation with jet mass
Findings
CONCLUSIONS