Monte Carlo (MC) generators are crucial for analyzing data in particle collider experiments. However, even a small mismatch between the MC simulations and the measurements can undermine the interpretation of the results. This is particularly important in the context of LHC searches for rare physics processes within and beyond the standard model (SM). One of the ultimate rare processes in the SM currently being explored at the LHC, $pp\to t\bar t t\bar t$, with its large multi-dimensional phase space, is an ideal testing ground for new ways to reduce the impact of potential MC mismodelling on experimental results. We propose a novel statistical method capable of disentangling the four-top signal from the dominant backgrounds in the same-sign dilepton channel, while simultaneously correcting for possible MC imperfections in the modelling of the most relevant discriminating observables, the jet multiplicity distributions. A Bayesian mixture of multinomials is used to model the light-jet and $b$-jet multiplicities under the assumption of their conditional independence. The signal and background distributions generated from a deliberately mistuned MC simulator are used as model priors. The posterior distributions, as well as the signal and background fractions, are then learned from the data using Bayesian inference. We demonstrate that our method can mitigate the effects of large MC mismodelling in the context of a realistic $t\bar t t\bar t$ search, leading to corrected posterior distributions that better approximate the underlying truth-level spectra.
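To make the structure of such a model concrete, the following is a minimal sketch, not the implementation used in this work. It assumes that each event carries a light-jet multiplicity bin `nj` and a $b$-jet multiplicity bin `nb`, that each process class `k` (signal or a background) has its own multinomial parameters `alpha[k]` and `beta[k]`, and that MC-derived priors enter as Dirichlet pseudo-counts. All function and variable names are hypothetical. The update shown is a pseudo-count-regularised EM step, a simple stand-in for full Bayesian inference, whose exact form the abstract does not specify.

```python
# Hypothetical sketch: mixture of multinomials over two conditionally
# independent observables (light-jet bin nj, b-jet bin nb).
# All names are illustrative assumptions, not taken from the paper.

def event_likelihood(nj, nb, pi, alpha, beta):
    """p(nj, nb) = sum_k pi[k] * alpha[k][nj] * beta[k][nb]."""
    return sum(pi[k] * alpha[k][nj] * beta[k][nb] for k in range(len(pi)))

def responsibilities(nj, nb, pi, alpha, beta):
    """Probability that an event with bins (nj, nb) belongs to each class k."""
    weights = [pi[k] * alpha[k][nj] * beta[k][nb] for k in range(len(pi))]
    total = sum(weights)
    return [w / total for w in weights]

def normalise(values):
    """Rescale non-negative values so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]

def em_step(events, pi, alpha, beta, prior_pi, prior_alpha, prior_beta):
    """One EM update with Dirichlet pseudo-counts acting as MC-derived priors.

    This is a simplification (assumption): the priors are added to the
    expected counts and the result is renormalised.
    """
    n_classes = len(pi)
    count_pi = list(prior_pi)
    count_alpha = [list(row) for row in prior_alpha]
    count_beta = [list(row) for row in prior_beta]
    for nj, nb in events:
        r = responsibilities(nj, nb, pi, alpha, beta)
        for k in range(n_classes):
            count_pi[k] += r[k]
            count_alpha[k][nj] += r[k]
            count_beta[k][nb] += r[k]
    new_pi = normalise(count_pi)
    new_alpha = [normalise(row) for row in count_alpha]
    new_beta = [normalise(row) for row in count_beta]
    return new_pi, new_alpha, new_beta
```

In this sketch, large pseudo-counts keep the per-class distributions close to the MC prediction, while small ones let the data reshape them. That trade-off is the mechanism by which a mistuned prior can be corrected.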