Abstract

We describe a novel application of the end-to-end deep learning technique to the task of discriminating top quark-initiated jets from those originating from the hadronization of a light quark or a gluon. The technique combines deep learning algorithms with a low-level detector representation of the high-energy collision event. In this study, we use low-level detector information from the simulated CMS Open Data samples to construct top jet classifiers. To optimize classifier performance, we progressively add low-level information from the CMS tracking detector, including pixel detector reconstructed hits and impact parameters, and demonstrate the value of additional tracking information even when no new spatial structures are added. Relying only on calorimeter energy deposits and reconstructed pixel detector hits, the end-to-end classifier achieves a ROC-AUC score of 0.975 ± 0.002 for the task of classifying boosted top quark jets. After adding derived track quantities, the ROC-AUC score increases to 0.9824 ± 0.0013, serving as the first performance benchmark for these CMS Open Data samples.
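For readers unfamiliar with the metric, the following is a minimal sketch of how a ROC-AUC score and a "mean ± spread" summary can be computed. It is not the authors' evaluation code: the function name `roc_auc`, the toy labels and scores, and the three per-run AUC values are all made up for illustration, and the assumption that the quoted uncertainty is a standard deviation over repeated trainings is ours, not something this text states.

```python
import statistics


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a randomly chosen positive example
    scores higher than a randomly chosen negative one (ties count as 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = top jet, 0 = light quark or gluon jet (made-up scores).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)

# Hypothetical AUCs from repeated trainings, summarized as mean and sample
# standard deviation (an assumed convention for the quoted uncertainty).
run_aucs = [0.981, 0.983, 0.984]
mean_auc = statistics.mean(run_aucs)
std_auc = statistics.stdev(run_aucs)
print(f"toy AUC = {auc:.3f}; runs: {mean_auc:.4f} ± {std_auc:.4f}")
```

The pairwise definition is quadratic in the number of jets, so it only suits small illustrations; a sort-based implementation would be the usual choice for full samples.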

Highlights

  • The Large Hadron Collider (LHC) is a prolific top quark factory: since the beginning of data taking in 2010, over 10^8 top quarks have been produced.

  • When the network is trained on images composed of BPIX1–3, electromagnetic calorimeter (ECAL), and hadronic calorimeter (HCAL) layers, we find that it outperforms the nominal combination of layers, shown in the second row of Table 4, and improves the area under the receiver operating characteristic curve (AUC) by 0.008.

  • With the exception of the final layer combination, where BPIX reconstructed hits (RecHits) are added to images composed of track pT + d0 + dZ + ECAL + HCAL information, we note that adding the RecHits gives a significant increase in network performance.
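The layer combinations mentioned in the highlights can be pictured as multi-channel images, one channel per detector layer. The sketch below illustrates that idea only, under stated assumptions: the helper names `make_layer` and `stack_layers`, the 4 × 4 grid, the channel ordering, and the single energy deposit are all hypothetical and do not reflect the image geometry or preprocessing used in the study.

```python
def make_layer(height, width, fill=0.0):
    """Create an empty detector layer as a height x width grid."""
    return [[fill] * width for _ in range(height)]


def stack_layers(layers, names):
    """Build a channels-first image [channel][row][col] from named layers,
    checking that every selected layer shares the same grid shape."""
    shapes = {(len(layers[n]), len(layers[n][0])) for n in names}
    if len(shapes) != 1:
        raise ValueError("all layers must share one grid shape")
    return [layers[n] for n in names]


# Hypothetical 4 x 4 grid; layer names follow the text, not any real file format.
H, W = 4, 4
track_and_calo = ["track_pt", "d0", "dz", "ECAL", "HCAL"]
pixel_hits = ["BPIX1", "BPIX2", "BPIX3"]
layers = {name: make_layer(H, W) for name in track_and_calo + pixel_hits}
layers["ECAL"][1][2] = 12.5  # made-up energy deposit in one cell

# Image without and with the pixel detector RecHit layers.
base = stack_layers(layers, track_and_calo)
with_hits = stack_layers(layers, track_and_calo + pixel_hits)
print(len(base), len(with_hits))
```

Adding or dropping a layer then changes only the number of input channels, which makes it straightforward to compare classifiers trained on different combinations.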

Summary

Introduction

The Large Hadron Collider (LHC) is a prolific top quark factory: since the beginning of data taking in 2010, over 10^8 top quarks have been produced. At hadron colliders like the LHC, the low production cross section of prompt electrons and muons can be exploited to boost tagging efficiency when identifying top quarks with a leptonically decaying W boson in their decay chain. Fully hadronic top decays offer no such lepton signature, so discriminating boosted top quark jets from light-flavour or gluon jets has become an important challenge for the LHC experiments, and a popular benchmark for data analysis techniques involving machine learning (ML) algorithms in high-energy physics (HEP). In previous work [13] we found that the track information was the leading contributor to the classifier's performance. Building on this insight, and given the importance of identifying displaced tracks associated with bottom quark decays, this new work introduces a number of key features from the CMS tracking detectors to exploit the full topology of hadronically decaying top quarks.

Open Data Simulated Samples
Findings
Interpretation and Discussion
Conclusions