The $\text{t}\bar{\text{t}}\text{H}(\text{b}\bar{\text{b}})$ process is an essential channel for revealing the properties of the Higgs boson; however, its final state has an irreducible background from the $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ process, which produces a top quark pair in association with a b quark pair. Therefore, understanding the $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ process is crucial for improving the sensitivity of a search for the $\text{t}\bar{\text{t}}\text{H}(\text{b}\bar{\text{b}})$ process. To this end, when measuring the differential cross section of the $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ process, we need to distinguish the b-jets originating from top quark decays from the additional b-jets originating from gluon splitting. In this paper, we train deep neural networks that identify the additional b-jets in $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ events under the supervision of a simulated $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ event data set in which the true additional b-jets are indicated. By exploiting the special structure of the $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ event data, we propose several loss functions and minimize them to directly increase matching efficiency, i.e., the accuracy of identifying additional b-jets. We show, via a proof-of-concept experiment using synthetic data, that our method can be more advantageous for improving matching efficiency than the deep learning-based binary classification approach presented in [1]. Based on simulated $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ event data in the lepton+jets channel from pp collisions at $\sqrt{s} = 13$ TeV, we then verify that our method can identify additional b-jets more accurately: compared with the approach in [1], the matching efficiency improves from 62.1% to 64.5% and from 59.9% to 61.7% for the leading order and the next-to-leading order simulations, respectively.
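To make the notion of matching efficiency concrete, the following is a minimal sketch of how such a figure could be computed. It is not the authors' implementation and rests on assumptions that the abstract does not state: each event is assumed to contain exactly two true additional b-jets, a model is assumed to output one score per jet, and the two highest-scoring jets are taken as the prediction. The names `select_top_two` and `matching_efficiency`, and all numbers in the toy data, are hypothetical.

```python
from typing import List, Sequence, Set


def select_top_two(scores: Sequence[float]) -> Set[int]:
    """Return the indices of the two highest-scoring jets in one event."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:2])


def matching_efficiency(
    event_scores: List[Sequence[float]],
    true_additional: List[Set[int]],
) -> float:
    """Fraction of events whose selected jet pair equals the true pair.

    Assumption: an event counts as matched only when both selected jets
    are the true additional b-jets.
    """
    if len(event_scores) != len(true_additional):
        raise ValueError("need one set of truth labels per event")
    if not event_scores:
        raise ValueError("need at least one event")
    matched = sum(
        select_top_two(scores) == truth
        for scores, truth in zip(event_scores, true_additional)
    )
    return matched / len(event_scores)


# Toy data: three events with four jets each (hypothetical scores).
scores = [
    [0.9, 0.1, 0.8, 0.2],  # selects jets {0, 2}
    [0.3, 0.7, 0.6, 0.1],  # selects jets {1, 2}
    [0.5, 0.4, 0.2, 0.9],  # selects jets {0, 3}
]
truth = [{0, 2}, {1, 3}, {0, 3}]

efficiency = matching_efficiency(scores, truth)
print(f"matching efficiency: {efficiency:.3f}")
```

Under these assumptions the metric is defined per event rather than per jet, which is why a loss that targets it directly can differ from a per-jet binary classification loss.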