Abstract

Data analysis in science, e.g., in high-energy particle physics, is often subject to an intractable likelihood if the observables and observations span a high-dimensional input space. Typically the problem is solved by reducing the dimensionality using feature engineering and histograms, whereby the latter allows one to build the likelihood using Poisson statistics. However, in the presence of systematic uncertainties represented by nuisance parameters in the likelihood, an optimal dimensionality reduction with a minimal loss of information about the parameters of interest is not known. This work presents a novel strategy to construct the dimensionality reduction with neural networks for feature engineering and a differentiable formulation of histograms, so that the full workflow can be optimized with the result of the statistical inference, e.g., the variance of a parameter of interest, as the objective. We discuss how this approach yields an estimate of the parameters of interest that is close to optimal, and we demonstrate the applicability of the technique with a simple example based on pseudo-experiments and a more complex example from high-energy particle physics.
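
The differentiable formulation of histograms mentioned above can be illustrated with a small sketch. The following assumes Gaussian kernel soft-binning, one standard way to obtain bin counts that carry gradients; the function name soft_histogram, the bin edges, and the bandwidth are illustrative choices, not necessarily the paper's exact formulation.

```python
# Minimal sketch of a differentiable histogram: hard bin assignments are
# replaced by Gaussian kernels, so gradients can flow from the bin counts
# back to the neural network output. Illustrative, not the paper's code.
import jax
import jax.numpy as jnp

def soft_histogram(values, edges, bandwidth=0.1):
    """Differentiable surrogate for jnp.histogram(values, edges)[0]."""
    # Each event contributes to a bin with the probability mass of a
    # Gaussian kernel between the bin's lower and upper edge.
    cdf = jax.scipy.stats.norm.cdf
    lo = cdf(edges[:-1][None, :], loc=values[:, None], scale=bandwidth)
    hi = cdf(edges[1:][None, :], loc=values[:, None], scale=bandwidth)
    return jnp.sum(hi - lo, axis=0)  # shape: (n_bins,)

edges = jnp.linspace(0.0, 1.0, 9)  # 8 bins on an NN output in [0, 1]
values = jax.random.uniform(jax.random.PRNGKey(0), (1000,))
counts = soft_histogram(values, edges)
# Gradients w.r.t. the inputs exist, unlike for a hard histogram:
grad = jax.grad(lambda v: soft_histogram(v, edges)[0])(values)
```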

Highlights

  • Measurements in many areas of research, such as high-energy particle physics, are typically based on the statistical inference of one or more parameters of interest defined by the likelihood L(D, θ), with the observables x ∈ X ⊆ R^d building the dataset D = {x_1, ..., x_n} ⊆ R^{n×d} and θ the parameters of the statistical model (a common form of such a likelihood is sketched after this list)

  • Given the assumption that the dimensionality reduction performed by the neural network (NN) together with the histogram is a sufficient statistic, the optimization can find a function f that gives the best estimate for the parameter of interest, similar to a statistical inference performed on the initial high-dimensional dataset D with an unbinned likelihood (see the variance-objective sketch after this list)

  • We have presented a novel approach to optimize statistical inference in the presence of systematic uncertainties when using dimensionality reduction of the dataset and likelihoods based on Poisson statistics
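
The highlights refer to the likelihood L(D, θ) only abstractly. One common form in this setting, assuming a single signal-strength parameter μ as the parameter of interest, bin-wise expected signal and background counts s_i(θ) and b_i(θ), observed counts d_i, and one nuisance parameter θ constrained by a Gaussian auxiliary measurement, is

L(μ, θ | D) = ∏_{i=1}^{N_bins} Pois(d_i | μ·s_i(θ) + b_i(θ)) · N(θ | 0, 1)

where Pois and N denote the Poisson and standard normal densities; the exact parameterization used in the paper may differ.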
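To make the last two highlights concrete, the following is a minimal sketch, not the paper's exact implementation, of how the variance of the parameter of interest can serve as a differentiable training objective: the variance estimate is taken as the (μ, μ) element of the inverse Hessian of the negative log-likelihood evaluated on an Asimov-like dataset. The function names nll and poi_variance and the 10% background scaling by the nuisance parameter are illustrative assumptions.

```python
# Sketch: variance of the parameter of interest as a differentiable
# objective, assuming the binned Poisson likelihood sketched above.
import jax
import jax.numpy as jnp

def nll(params, sig, bkg, data):
    """Negative log-likelihood; params = (mu, theta)."""
    mu, theta = params
    # Illustrative assumption: the nuisance parameter scales the background.
    lam = mu * sig + bkg * (1.0 + 0.1 * theta)
    poisson = jnp.sum(lam - data * jnp.log(lam))  # up to constants
    constraint = 0.5 * theta**2                   # Gaussian auxiliary term
    return poisson + constraint

def poi_variance(sig, bkg):
    """Var(mu_hat) from the inverse Hessian at mu=1, theta=0."""
    params = jnp.array([1.0, 0.0])
    data = sig + bkg  # Asimov-like dataset: observed = expected
    hess = jax.hessian(nll)(params, sig, bkg, data)
    return jnp.linalg.inv(hess)[0, 0]

sig = jnp.array([5.0, 20.0, 50.0])
bkg = jnp.array([100.0, 60.0, 10.0])
# Scalar objective; since sig and bkg would come from a differentiable
# (soft) histogram of the NN output, it can be minimised by gradient
# descent on the network weights.
print(poi_variance(sig, bkg))
```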

Summary

Introduction

Measurements in many areas of research, such as high-energy particle physics, are typically based on the statistical inference of one or more parameters of interest defined by the likelihood L(D, θ), with the observables x ∈ X ⊆ R^d building the dataset D = {x_1, ..., x_n} ⊆ R^{n×d} and θ the parameters of the statistical model. The likelihood would have to be evaluated on the dataset D spanning a high-dimensional input space, which is computationally expensive and typically intractable. Section “Application to a Simple Example Based on Pseudo-Experiments” shows the performance of the method with a simple example using pseudo-experiments of a two-component mixture model with signal and background, and Section “Application to a More Complex Analysis Task Typical for High-Energy Particle Physics” applies the same approach to a more complex example from high-energy particle physics.
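
As an illustration of the simple example's setup, the following sketch generates pseudo-experiments from a two-component mixture model. Signal and background are drawn here as two-dimensional Gaussians, with a hypothetical nuisance parameter theta shifting the background mean; the means, covariance, and shift are assumptions for illustration, not the paper's exact configuration.

```python
# Sketch: pseudo-experiments from a signal-plus-background mixture model.
import jax
import jax.numpy as jnp

def pseudo_experiment(key, n_sig=1000, n_bkg=1000, theta=0.0):
    k1, k2 = jax.random.split(key)
    signal = jax.random.multivariate_normal(
        k1, mean=jnp.array([1.0, 1.0]), cov=jnp.eye(2), shape=(n_sig,))
    background = jax.random.multivariate_normal(
        k2, mean=jnp.array([-1.0, -1.0 + theta]), cov=jnp.eye(2),
        shape=(n_bkg,))
    x = jnp.concatenate([signal, background])  # observables, shape (n, 2)
    y = jnp.concatenate([jnp.ones(n_sig), jnp.zeros(n_bkg)])  # labels
    return x, y

x, y = pseudo_experiment(jax.random.PRNGKey(0))
# Systematic variation: same events with the nuisance parameter shifted.
x_up, _ = pseudo_experiment(jax.random.PRNGKey(0), theta=1.0)
```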

Methods
Related Work
Summary
Findings
Compliance with ethical standards
