Abstract
Optimal transport theory has recently found many applications in machine learning thanks to its capacity to meaningfully compare machine learning objects that can be viewed as distributions. The Kantorovich formulation, leading to the Wasserstein distance, focuses on the features of the elements of the objects but treats them independently, whereas the Gromov–Wasserstein distance focuses on the relations between the elements, depicting the structure of the object, yet discarding its features. In this paper, we study the Fused Gromov–Wasserstein distance, which extends the Wasserstein and Gromov–Wasserstein distances in order to encode the feature and structure information simultaneously. We provide the mathematical framework for this distance in the continuous setting, prove its metric and interpolation properties, and provide a concentration result for the convergence of finite samples. We also illustrate and interpret its use in various applications where structured objects are involved.
Highlights
We focus on the comparison of structured objects, i.e., objects defined by both feature and structure information
We prove that the Fused Gromov–Wasserstein (FGW) distance is a distance with respect to the equivalence relation between structured objects, as defined in Definition 7, allowing us to derive a topology on S(Ω)
This paper presents the Fused Gromov–Wasserstein (FGW) distance
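As a concrete illustration of the highlights above, the discrete FGW objective can be evaluated for a fixed coupling between two structured objects, each described by a feature distance matrix and an intra-object structure matrix. The sketch below is ours, not taken from the paper (the function and variable names are our own); it assumes the standard discrete formulation of the FGW loss with trade-off parameter α and exponent q:

```python
import numpy as np

def fgw_cost(pi, M, C1, C2, alpha=0.5, q=2):
    """Evaluate the FGW objective for a fixed coupling pi.

    pi     : coupling matrix (nonnegative, sums to 1)
    M[i,j] : feature distance d(a_i, b_j) between the objects' features
    C1, C2 : intra-object structure matrices (e.g. pairwise distances)
    alpha  : trade-off between features (alpha=0) and structure (alpha=1)
    """
    # Feature (Wasserstein-like) term: since the entries of pi sum to 1,
    # sum_{ijkl} d(a_i,b_j)^q pi_ij pi_kl reduces to sum_{ij} d^q pi_ij.
    feat = (1 - alpha) * np.sum(M ** q * pi)
    # Structure (Gromov-Wasserstein-like) term:
    # sum_{i,j,k,l} |C1[i,k] - C2[j,l]|^q pi[i,j] pi[k,l]
    diff = np.abs(C1[:, None, :, None] - C2[None, :, None, :]) ** q
    struct = alpha * np.einsum("ijkl,ij,kl->", diff, pi, pi)
    return feat + struct
```

For two identical structured objects matched by the identity coupling, both terms vanish and the cost is 0; with alpha=0 the objective reduces to the feature term alone, and with alpha=1 to the structure term alone.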
Summary
We focus on the comparison of structured objects, i.e., objects defined by both feature and structure information. The Optimal Transport (OT) framework defines distances between probability measures that describe either the feature or the structure information of such objects. The Wasserstein distance d_W,p(μ_A, ν_B) (known in the computer vision community as the Earth Mover's Distance) defines a distance on probability measures; in particular, d_W,p(μ_A, ν_B) = 0 iff μ_A = ν_B. This distance has a nice geometrical interpretation, as it represents an optimal cost (w.r.t. the ground metric d) of moving the measure μ_A onto ν_B, with π(a, b) the amount of probability mass shifted from a to b (see Figure 5). In this sense, the Wasserstein distance quantifies how "far" μ_A is from ν_B by measuring how "difficult" it is to move all the mass of μ_A onto ν_B. Optimal transport can deal with both smooth and discrete measures, and it has proved to be very useful for comparing distributions in a shared space but with different (and even non-overlapping) supports
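For discrete measures, the Wasserstein distance described above is the value of a linear program over couplings π whose marginals are μ_A and ν_B. The following minimal sketch solves this Kantorovich LP with SciPy; the helper name `wasserstein_lp` is ours, chosen for illustration:

```python
import numpy as np
from scipy.optimize import linprog

def wasserstein_lp(mu, nu, C):
    """Solve the Kantorovich LP: min_pi <pi, C> s.t. pi 1 = mu, pi^T 1 = nu.

    mu, nu : probability weights of the two discrete measures
    C[i,j] : ground cost of moving mass from atom i of mu to atom j of nu
    Returns the optimal cost and the optimal coupling pi.
    """
    n, m = C.shape
    A_eq = []
    # Row-sum constraints: the mass leaving atom i equals mu[i].
    for i in range(n):
        row = np.zeros((n, m)); row[i, :] = 1.0
        A_eq.append(row.ravel())
    # Column-sum constraints: the mass arriving at atom j equals nu[j].
    for j in range(m):
        col = np.zeros((n, m)); col[:, j] = 1.0
        A_eq.append(col.ravel())
    b_eq = np.concatenate([mu, nu])
    res = linprog(C.ravel(), A_eq=np.array(A_eq), b_eq=b_eq,
                  bounds=(0, None), method="highs")
    return res.fun, res.x.reshape(n, m)
```

For example, moving the uniform measure on {0, 1} onto the uniform measure on {1, 2} with cost C[i,j] = |x_i − y_j| gives an optimal cost of 1, matching the intuition of shifting all mass one unit to the right; identical measures give cost 0.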