Abstract

One major challenge for the legacy measurements at the LHC is that the likelihood function is not tractable when the collected data is high-dimensional and the detector response has to be modeled. We review how different analysis strategies solve this issue, including the traditional histogram approach used in most particle physics analyses, the Matrix Element Method, Optimal Observables, and modern techniques based on neural density estimation. We then discuss powerful new inference methods that use a combination of matrix element information and machine learning to accurately estimate the likelihood function. The MadMiner package automates all necessary data-processing steps. In first studies we find that these new techniques have the potential to substantially improve the sensitivity of the LHC legacy measurements.

Highlights

  • Large Hadron Collider (LHC) processes are most accurately described by a suite of complex computer simulations that describe parton density functions, hard process, parton shower, hadronization, detector response, sensor readout, and construction of observables with impressive precision

  • The likelihood function p(v(x)|θ) in the space of these summary statistics can be computed with simple density estimation techniques such as one-dimensional or two-dimensional histograms, kernel density estimation techniques, or Gaussian processes, and used instead of the likelihood function of the high-dimensional event data

  • While Monte-Carlo simulations provide an excellent description of this process, they do not allow us to explicitly calculate the corresponding likelihood function
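The histogram strategy from the highlights can be made concrete with a toy sketch. This is not MadMiner's API; a one-dimensional Gaussian stands in for the full simulation chain, and a simple binned density over a summary statistic v(x) serves as the estimate of p(v(x)|θ) for two parameter points:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta, n):
    """Toy stand-in for the simulation chain: draw events whose
    summary statistic v(x) shifts with the parameter theta."""
    return rng.normal(loc=theta, scale=1.0, size=n)

def histogram_likelihood(samples, bins):
    """Estimate p(v|theta) from simulated events with a 1D histogram."""
    density, edges = np.histogram(samples, bins=bins, density=True)
    def pdf(v):
        # Map the observed summary statistic to its bin's density
        idx = np.clip(np.searchsorted(edges, v) - 1, 0, len(density) - 1)
        return density[idx]
    return pdf

bins = np.linspace(-5.0, 5.0, 41)
pdf_0 = histogram_likelihood(simulate(theta=0.0, n=100_000), bins)
pdf_1 = histogram_likelihood(simulate(theta=1.0, n=100_000), bins)

# Binned log-likelihood ratio between the two hypotheses for one event
v_obs = 0.9
llr = np.log(pdf_1(v_obs) / pdf_0(v_obs))
```

The binned density replaces the intractable likelihood of the full high-dimensional event, at the price of discarding all information not captured by the chosen summary statistic.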


Introduction

LHC processes are most accurately described by a suite of complex computer simulations that model parton density functions, the hard process, parton shower, hadronization, detector response, sensor readout, and the construction of observables with impressive precision. These tools take values of the parameters θ as input and use Monte-Carlo techniques to sample from the many different ways in which an event can develop, leading to simulated samples of observations x ∼ p(x|θ). Each simulated event also implicitly defines a joint likelihood p(x, z|θ), where z_p are the properties of the elementary particles in the hard interaction (four-momenta, helicities, charges, and flavours), z_s describes the parton shower and hadronization, and z_d are the variables characterizing the detector interactions in one simulated event. These latent variables form an extremely high-dimensional space: with state-of-the-art simulators, including Geant4 [5] for the detector simulation, one simulated event can involve tens of millions of random numbers! Phrasing particle physics measurements in this language allows us to tap into recent developments in these fields as well as in statistics and computer science.
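The structure of this forward model can be sketched with a toy simulator. The staged latent variables below are purely illustrative stand-ins for z_p, z_s, and z_d; the point is that sampling x ∼ p(x|θ) just means running the simulator, while evaluating the marginal likelihood p(x|θ) would require integrating over every latent path:

```python
import numpy as np

rng = np.random.default_rng(42)

def forward_model(theta):
    """Toy stand-in for the simulation chain: each stage adds latent
    randomness, mirroring z_p (hard process), z_s (shower/hadronization),
    and z_d (detector response)."""
    z_p = rng.exponential(scale=theta)       # hard-process latent variable
    z_s = z_p + rng.normal(scale=0.5)        # shower/hadronization smearing
    z_d = z_s + rng.normal(scale=0.2)        # detector response
    x = z_d                                  # reconstructed observable
    return x, (z_p, z_s, z_d)

# Sampling x ~ p(x|theta) is easy: just run the simulator.
events = [forward_model(theta=2.0)[0] for _ in range(5)]

# But p(x|theta) = ∫ p(x, z|theta) dz over all latent trajectories,
# which is intractable when z has millions of dimensions, as at the LHC.
```

Here only three latent numbers appear per event; in a realistic chain the latent space has tens of millions of dimensions, which is why the marginal likelihood cannot be computed explicitly.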

