Abstract

We introduce an efficient framework for computing the distance between collider events using the tools of Linearized Optimal Transport (LOT). This preserves many of the advantages of the recently-introduced Energy Mover's Distance, which quantifies the "work" required to rearrange one event into another, while significantly reducing the computational cost. It also furnishes a Euclidean embedding amenable to simple machine learning algorithms and visualization techniques, which we demonstrate in a variety of jet tagging examples. The LOT approximation lowers the threshold for diverse applications of the theory of optimal transport to collider physics.

Highlights

  • What is the distance between collider events? This question, simple to pose, is notoriously difficult to answer

  • To the extent that the 2-Wasserstein distance has a pseudo-Riemannian structure, the Linearized Optimal Transport (LOT) approximation amounts to projecting onto the 2-Wasserstein tangent plane at a chosen reference event and computing simpler l2 distances on that plane. We make this point of view rigorous in the Appendix, where we prove that, as the reference event in the LOT approximation is refined, LOT converges to the distance between events on the tangent plane, which provides a well-defined metric on the space of events

  • Section IV explores the performance of linear discriminant analysis (LDA), k-nearest neighbor, support vector machine (SVM), and k-medoids clustering algorithms in the pairwise classification of boosted QCD, W, t, Higgs, and beyond-Standard Model (BSM) jets

Summary

INTRODUCTION

What is the distance between collider events? This question, simple to pose, is notoriously difficult to answer. One of the major practical challenges to the use of the Energy Mover's Distance (EMD) in analyzing collider events is the computational cost: for a dataset containing $N_{\rm evt}$ events, computing the pairwise distance between all events is $\mathcal{O}(N_{\rm evt}^2)$. We define an efficient framework for computing the distance between collider events by applying the tools of Linearized Optimal Transport (LOT), preserving the many advantages of the EMD while significantly reducing the computational cost and furnishing a Euclidean embedding suitable for use in a wide range of ML algorithms. A proof of the convergence of the LOT approximation to a true metric in the continuum limit is reserved for the Appendix.
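The cost saving can be illustrated with a minimal sketch of a discrete LOT embedding. It is not the authors' implementation. It assumes each event is a weighted point cloud (particle positions in a two-dimensional plane with energy fractions summing to 1), uses the standard barycentric-projection form of discrete LOT, and solves each transport problem with a generic linear-programming routine from SciPy rather than a dedicated optimal transport solver. The function names `ot_plan` and `lot_embedding` are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def ot_plan(a, b, cost):
    """Solve a small discrete optimal transport problem as a linear program.

    a: (n,) source weights, b: (m,) target weights, both summing to 1.
    cost: (n, m) ground cost matrix. Returns the (n, m) transport plan.
    """
    n, m = cost.shape
    # Row-sum constraints: sum_j plan[i, j] = a[i]
    row = np.kron(np.eye(n), np.ones((1, m)))
    # Column-sum constraints: sum_i plan[i, j] = b[j]
    col = np.kron(np.ones((1, n)), np.eye(m))
    res = linprog(cost.ravel(), A_eq=np.vstack([row, col]),
                  b_eq=np.concatenate([a, b]), bounds=(0, None),
                  method="highs")
    return res.x.reshape(n, m)


def lot_embedding(ref_xy, ref_w, evt_xy, evt_w):
    """Embed one event as a flat vector relative to the reference event."""
    # Squared Euclidean ground cost, as in the 2-Wasserstein distance.
    cost = ((ref_xy[:, None, :] - evt_xy[None, :, :]) ** 2).sum(axis=-1)
    plan = ot_plan(ref_w, evt_w, cost)
    # Barycentric projection: average destination of each reference particle.
    target = plan @ evt_xy / ref_w[:, None]
    # Scale by sqrt(weight) so plain Euclidean norms give the LOT distance.
    return (np.sqrt(ref_w)[:, None] * (target - ref_xy)).ravel()


# Toy data (assumed for illustration): one reference and five random events.
rng = np.random.default_rng(0)
ref_xy = rng.normal(size=(8, 2))
ref_w = np.full(8, 1.0 / 8)

events = []
for _ in range(5):
    n = int(rng.integers(5, 12))
    w = rng.random(n)
    events.append((rng.normal(size=(n, 2)), w / w.sum()))

# One transport problem per event, not one per pair of events.
embeddings = np.array([lot_embedding(ref_xy, ref_w, xy, w) for xy, w in events])

# All pairwise LOT distances are then ordinary Euclidean distances.
lot_dist = np.linalg.norm(embeddings[:, None, :] - embeddings[None, :, :], axis=-1)
```

In this sketch, $N_{\rm evt}$ events require $N_{\rm evt}$ transport solves against the reference instead of one solve per pair, and the rows of `embeddings` can be passed directly to algorithms that expect fixed-length Euclidean feature vectors.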

LINEARIZED OPTIMAL TRANSPORT
OBJECT CLASSIFICATION WITH LOT
MACHINE LEARNING WITH LOT
CONCLUSION
Findings