Automating the ABCD method with machine learning

Gregor Kasieczka, David Shih, Matthew D Schwartz, Benjamin Nachman

doi:10.1103/physrevd.103.035021

Abstract

The ABCD method is one of the most widely used data-driven background estimation techniques in high energy physics. Cuts on two statistically independent classifiers separate signal and background into four regions, so that background in the signal region can be estimated simply using the other three control regions. Typically, the independent classifiers are chosen "by hand" to be intuitive and physically motivated variables. Here, we explore the possibility of automating the design of one or both of these classifiers using machine learning. We show how to use state-of-the-art decorrelation methods to construct powerful yet independent discriminators. Along the way, we uncover a previously unappreciated aspect of the ABCD method: its accuracy hinges on having low signal contamination in control regions not just overall, but relative to the signal fraction in the signal region. We demonstrate the method with three examples: a simple model consisting of three-dimensional Gaussians; boosted hadronic top jet tagging; and a recast search for paired dijet resonances. In all cases, automating the ABCD method with machine learning significantly improves performance in terms of ABCD closure, background rejection and signal contamination.
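
The point about relative signal contamination can be made concrete with a short first-order expansion. The notation below is an assumption introduced for illustration, not taken from the text above: A is the signal region, B, C and D are the control regions, B_i and S_i are the background and signal yields in region i, and \delta_i = S_i / B_i. Background independence is taken to mean B_A = B_B B_C / B_D.

```latex
% Assumed notation: N_i = B_i + S_i = B_i (1 + \delta_i), with A the signal region.
\begin{align}
N_A^{\mathrm{pred}} &= \frac{N_B N_C}{N_D}
  = \frac{B_B B_C}{B_D}\,\frac{(1+\delta_B)(1+\delta_C)}{1+\delta_D}
  = B_A\,\frac{(1+\delta_B)(1+\delta_C)}{1+\delta_D}, \\
N_A - N_A^{\mathrm{pred}} &\approx B_A\,\delta_A
  \left[\,1 - \frac{\delta_B + \delta_C - \delta_D}{\delta_A}\right]
  \qquad (\delta_B, \delta_C, \delta_D \ll 1).
\end{align}
```

Since the true signal yield is S_A = B_A \delta_A, the bracket is the fraction of it that survives the background subtraction. In this sketch the bias is set by the ratios \delta_i / \delta_A rather than by the \delta_i alone, which is the sense in which contamination must be small relative to the signal fraction in the signal region.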

Highlights

A key component of high energy physics data analysis, whether for Standard Model (SM) measurements or searches beyond the SM, is background estimation.
The idea behind all data-driven background estimation strategies is to extrapolate or interpolate from background-dominated control regions into a signal region of interest.
We will show that single and double distance correlation (DisCo) not only improve the discrimination power and background closure of the ABCD method but can also significantly reduce the level of signal contamination.
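
As a rough illustration of the quantity behind DisCo, the sketch below computes the standard (biased) sample distance correlation between two one-dimensional arrays with NumPy. The function name `distance_correlation` and the toy inputs are hypothetical, chosen here for illustration. Decorrelation methods use a quantity of this kind as a penalty during classifier training; that training loop is not shown, and this is not presented as the authors' implementation.

```python
import numpy as np


def distance_correlation(x, y):
    """Biased sample distance correlation of two 1D samples, in [0, 1]."""
    x = np.asarray(x, dtype=float).reshape(-1)
    y = np.asarray(y, dtype=float).reshape(-1)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same length")

    # Pairwise absolute-distance matrices.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])

    # Double centering: subtract row and column means, add back the grand mean.
    a_c = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    b_c = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()

    dcov2 = (a_c * b_c).mean()   # squared distance covariance
    dvar2_x = (a_c * a_c).mean() # squared distance variance of x
    dvar2_y = (b_c * b_c).mean() # squared distance variance of y

    denom = np.sqrt(dvar2_x * dvar2_y)
    if denom == 0.0:
        return 0.0  # at least one sample is constant
    return float(np.sqrt(max(dcov2, 0.0) / denom))


# Toy inputs (hypothetical): a nonlinear dependence that Pearson correlation misses.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
dcor_linear = distance_correlation(x, 2.0 * x + 1.0)
dcor_quadratic = distance_correlation(x, x**2)
dcor_independent = distance_correlation(x, rng.normal(size=1000))
```

Unlike the Pearson coefficient, distance correlation vanishes in the large-sample limit only for independent variables, which is why it suits the independence requirement of the ABCD method. The pairwise matrices make this version quadratic in memory, so it is meant for small batches.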

Summary

INTRODUCTION

A key component of high energy physics data analysis, whether for Standard Model (SM) measurements or searches beyond the SM, is background estimation. The idea of the ABCD method is to pick two observables f and g (for example, the invariant mass of a dijet system and the rapidity of that system) which are approximately statistically independent for the background, and which are effective discriminators of signal versus background. Simple thresholds on these observables partition events into four regions. When f and g are not exactly independent, the estimate is typically corrected using simulation. Ideally, this simulation correction has small uncertainties, either because the effect itself is small or because the correction is robust. Such corrections, together with the fact that simple kinematic features are typically not optimal discriminants of signal versus background, generally limit the effectiveness of the ABCD method and the sensitivity of the analysis in question.
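
The partition into four regions and the resulting estimate can be sketched in a few lines of Python. The region labels (A for the signal region where both cuts pass, B and C where exactly one passes, D where neither passes), the function name `abcd_estimate`, the thresholds and the toy Gaussian background are all assumptions made for this illustration.

```python
import numpy as np


def abcd_estimate(f, g, f_cut, g_cut):
    """Return (predicted, observed) event counts in region A.

    Region A: f > f_cut and g > g_cut (signal region, by assumption).
    The prediction uses N_A = N_B * N_C / N_D, valid for independent f and g.
    """
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    pass_f = f > f_cut
    pass_g = g > g_cut

    n_a = int(np.sum(pass_f & pass_g))
    n_b = int(np.sum(pass_f & ~pass_g))
    n_c = int(np.sum(~pass_f & pass_g))
    n_d = int(np.sum(~pass_f & ~pass_g))

    if n_d == 0:
        raise ValueError("region D is empty; the ABCD estimate is undefined")
    return n_b * n_c / n_d, n_a


# Toy background-only sample with independent f and g (hypothetical numbers).
rng = np.random.default_rng(0)
f_bkg = rng.normal(size=200_000)
g_bkg = rng.normal(size=200_000)

predicted, observed = abcd_estimate(f_bkg, g_bkg, f_cut=1.0, g_cut=1.0)
closure = predicted / observed  # close to 1 when f and g are independent
```

For a background-only sample, a closure ratio near 1 indicates that the independence assumption holds; correlating `f_bkg` and `g_bkg` in this toy pulls the ratio away from 1, which is the kind of non-closure that a simulation-based correction would have to absorb.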

THE ABCD METHOD

AUTOMATING THE ABCD METHOD

Simple example

Boosted tops

RPV SUSY

Findings

DISCUSSION

CONCLUSIONS

Journal: Physical Review D
Publication Date: Feb 22, 2021
Citations: 35
License type: CC BY 4.0
