Abstract

Beginning from a basic neural-network architecture, we test the potential benefits offered by a range of advanced techniques for machine learning, in particular deep learning, in the context of a typical classification problem encountered in the domain of high-energy physics, using a well-studied dataset: the 2014 Higgs ML Kaggle dataset. The advantages are evaluated in terms of both performance metrics and the time required to train and apply the resulting models. Techniques examined include domain-specific data augmentation, learning-rate and momentum scheduling, advanced ensembling in both model-space and weight-space, and alternative architectures and connection methods. Following the investigation, we arrive at a model which achieves equal performance to the winning solution of the original Kaggle challenge, whilst being significantly quicker to train and apply, and being suitable for use with both GPU and CPU hardware setups. These reductions in timing and hardware requirements potentially allow the use of more powerful algorithms in HEP analyses, where models must be retrained frequently, sometimes at short notice, by small groups of researchers with limited hardware resources. Additionally, a new wrapper library for PyTorch called LUMIN is presented, which incorporates all of the techniques studied.
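As a concrete illustration of two of the techniques named in the abstract (learning-rate and momentum scheduling, and weight-space ensembling), a minimal sketch in plain PyTorch is given below. It is a hypothetical example, not code from LUMIN or the paper: the network and data are toys, the schedule uses torch.optim.lr_scheduler.OneCycleLR (a 1cycle-style policy that cycles the learning rate and momentum), and the weight-space ensemble is built with torch.optim.swa_utils.AveragedModel (stochastic weight averaging).

    # Hypothetical sketch (not LUMIN/paper code): 1cycle-style learning-rate and
    # momentum scheduling plus weight-space ensembling (stochastic weight averaging).
    import torch
    from torch import nn
    from torch.optim.swa_utils import AveragedModel, update_bn

    # Toy binary classifier and random data stand in for the Higgs ML features.
    model = nn.Sequential(nn.Linear(30, 100), nn.BatchNorm1d(100), nn.ReLU(),
                          nn.Linear(100, 1))
    x = torch.randn(1024, 30)
    y = torch.randint(0, 2, (1024, 1)).float()
    loader = torch.utils.data.DataLoader(
        torch.utils.data.TensorDataset(x, y), batch_size=64, shuffle=True)

    opt = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
    n_epochs = 10
    # OneCycleLR ramps the learning rate up and then anneals it down over the
    # full run, cycling the momentum inversely to the learning rate.
    sched = torch.optim.lr_scheduler.OneCycleLR(
        opt, max_lr=1e-1, epochs=n_epochs, steps_per_epoch=len(loader))
    swa_model = AveragedModel(model)  # running average of weights
    loss_fn = nn.BCEWithLogitsLoss()

    for epoch in range(n_epochs):
        for xb, yb in loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()
            sched.step()  # per-batch step for the 1cycle schedule
        if epoch >= n_epochs // 2:  # average only once training has settled
            swa_model.update_parameters(model)

    update_bn(loader, swa_model)  # refresh batch-norm stats for the averaged weights

The sketch shows only the bare scheduling and averaging steps; a library such as LUMIN wraps this kind of loop and adds the remaining ingredients studied in the paper, such as data augmentation and model-space ensembling.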

Highlights

  • The rise of machine-learning (ML) applications has had a remarkable impact on many areas of industry and in a few academic disciplines

  • The field of experimental high-energy physics (HEP) was comparatively slower to adopt these new approaches, with only the occasional and isolated use of basic multivariate analyses (MVAs), such as b-jet tagging at LEP, e.g. Ref. [1] in 1995, and Dø's observation of single-top-quark production [2] in 2001; in the latter, the authors explicitly noted the importance of their MVA in fully utilising all available data to make the observation

  • This process of adoption continued at the Large Hadron Collider (LHC) with analyses such as Ref. [6] (2011), which performed two complementary measurements: one using physics-derived variables, and the other using an MVA (a boosted decision tree, BDT)



Introduction

The rise of machine-learning (ML) applications has had a remarkable impact on many areas of industry and in a few academic disciplines. The field of experimental high-energy physics (HEP) was comparatively slower to adopt these new approaches, with only the occasional and isolated use of basic multivariate analyses (MVAs), such as b-jet tagging at LEP, e.g. Ref. [1] in 1995, and Dø's observation of single-top-quark production [2] in 2001; in the latter, the authors explicitly noted the importance of their MVA in fully utilising all available data to make the observation. A similar sentiment appears in Ref. [3] (2003), which notes "The best use of data is ensured only with a multivariate treatment" when discussing approaches to data analysis by the CDF [4] and Dø [5] collaborations during Run II of the Tevatron (2001-2011). This process of adoption continued at the Large Hadron Collider (LHC) with analyses such as Ref. [6] (2011), which performed two complementary measurements: one using physics-derived variables, and the other using an MVA (a boosted decision tree, BDT). A notable example came in 2012 with the concerted use of no fewer than four MVAs (BDTs) in a single analysis: the search for Higgs boson decays to pairs of photons performed by the CMS collaboration at the LHC [7], which contributed significantly to the discovery of the Higgs boson by ATLAS [8] and CMS [9]. This process of MVA integration accompanied a shift in the community's trust of approaches which relied less on expert knowledge.
