How to Use Machine Learning to Improve the Discrimination between Signal and Background at Particle Colliders

Xabier Cid Vidal, Lorena Dieste Maroñas, Álvaro Dosil Suárez

doi:10.3390/app112211076

Open Access

Journal: Applied Sciences
Publication Date: Nov 22, 2021
Citations: 4
License type: CC BY 4.0

Affiliation: University of Santiago de Compostela

Abstract

The popularity of Machine Learning (ML) has been increasing in recent decades in almost every area, most notably in the commercial and scientific fields. In particle physics, ML has proven to be a useful resource to make the most of projects such as the Large Hadron Collider (LHC). The main advantages provided by ML are a reduction in the time and effort required for the measurements carried out by experiments and an improvement in their performance. With this work we aim to encourage scientists working with particle colliders to use ML and to try the different alternatives that are available, focusing on the separation of signal and background. We assess some of the most-used libraries in the field, such as the Toolkit for Multivariate Data Analysis with ROOT (TMVA), and also newer and more sophisticated options such as PyTorch and Keras. We also assess the suitability of some of the most common algorithms for signal-background discrimination, such as Boosted Decision Trees, and propose the use of others, namely Neural Networks. We compare the overall performance of different algorithms and libraries on simulated LHC data and produce some guidelines to help analysts deal with different situations. Examples include the use of low- or high-level features from particle detectors and the amount of statistics that is available for training the algorithms. Our main conclusion is that the algorithms and libraries used more frequently at LHC collaborations might not always be those that provide the best results for the classification of signal candidates, and that fully connected Neural Networks trained with Keras can improve the performance scores in most of the cases we formulate.

Highlights

Particle physics experiments, and especially those at particle colliders, have to deal with vast amounts of data where, very often, an elusive signal must be found against a much larger background
Given that we use LHCb as the benchmark for our studies, we focused on several B meson decay modes that were studied in the experiment

Summary

Introduction

Particle physics experiments, and especially those at particle colliders, have to deal with vast amounts of data where, very often, an elusive signal must be found against a much larger background. This has naturally paved the way for the usage of Machine Learning (ML). As in other problems where ML applies, the separation between signal and background relies on several variables (features) that behave differently in the two categories. The use of ML in particle physics is an emerging area, which is extending to more and more fields, as we shall see. A very complete (and live) review of this wide range of uses can be found in Reference [1].
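To make the idea of features that behave differently in signal and background concrete, the following is a minimal sketch on toy data, not the setup used in the paper. Everything in it is an assumption made for illustration: the two Gaussian features, their means, the sample sizes, and the helper names (make_toy_sample, train_logistic, accuracy). The classifier is a plain logistic regression (a single-neuron network) trained by stochastic gradient descent, swapped in for the Boosted Decision Trees and fully connected Neural Networks discussed in the paper so that the example needs only the Python standard library.

```python
import math
import random


def make_toy_sample(n, rng):
    """Hypothetical toy data: two features per candidate, label 1 = signal, 0 = background."""
    data = []
    for _ in range(n):
        is_signal = rng.random() < 0.5
        # Assumption: both features are Gaussian, shifted up for signal, down for background.
        mean = 1.0 if is_signal else -1.0
        features = (rng.gauss(mean, 1.0), rng.gauss(mean, 1.0))
        data.append((features, 1 if is_signal else 0))
    return data


def sigmoid(z):
    # Clamp the argument so math.exp never overflows.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(data, epochs=20, lr=0.05):
    """Fit weights w and bias b with stochastic gradient descent on the log-loss."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
            err = p - y  # gradient of the log-loss with respect to the linear output
            w[0] -= lr * err * x[0]
            w[1] -= lr * err * x[1]
            b -= lr * err
    return w, b


def accuracy(data, w, b):
    """Fraction of candidates classified correctly with a 0.5 threshold on the score."""
    correct = 0
    for x, y in data:
        score = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
        if (score > 0.5) == (y == 1):
            correct += 1
    return correct / len(data)


rng = random.Random(42)
train = make_toy_sample(2000, rng)
test = make_toy_sample(1000, rng)  # independent sample, never used for training
w, b = train_logistic(train)
acc = accuracy(test, w, b)
print(f"test accuracy: {acc:.3f}")
```

Evaluating on a sample that was not used for training mirrors the usual practice of checking a classifier on independent simulated events. In a real analysis the toy Gaussians would be replaced by detector-level or higher-level features, and the single neuron by one of the algorithms the paper compares.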

Objectives

Results

Conclusion