Abstract

Optimal transport theory has recently found many applications in machine learning, thanks to its capacity to meaningfully compare machine learning objects that are viewed as distributions. The Kantorovich formulation, leading to the Wasserstein distance, focuses on the features of the elements of the objects but treats them independently, whereas the Gromov–Wasserstein distance focuses on the relations between the elements, capturing the structure of the object yet discarding its features. In this paper, we study the Fused Gromov–Wasserstein distance, which extends the Wasserstein and Gromov–Wasserstein distances in order to encode both the feature and the structure information simultaneously. We provide the mathematical framework for this distance in the continuous setting, prove its metric and interpolation properties, and provide a concentration result for the convergence of finite samples. We also illustrate and interpret its use in various applications where structured objects are involved.

Highlights

  • We focus on the comparison of structured objects, i.e., objects defined by both feature and structure information

  • We prove that the Fused Gromov–Wasserstein (FGW) distance is a distance with respect to the equivalence relation between structured objects, as defined in Definition 7, allowing us to derive a topology on S(Ω)

  • This paper presents the Fused Gromov–Wasserstein (FGW) distance
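
To make the trade-off between features and structure concrete, here is a minimal sketch (not the authors' implementation; all names are ours) that evaluates the discrete FGW objective for a fixed coupling π. The loss combines, for every pair of matched pairs, a feature-distance term and a structure-discrepancy term weighted by a parameter α:

```python
def fgw_cost(M, C1, C2, pi, alpha=0.5, q=2):
    """Evaluate the discrete FGW objective for a fixed coupling pi.

    M[i][j] : feature distance d(a_i, b_j) between the samples' features
    C1, C2  : intra-object structure matrices (e.g. pairwise distances)
    pi      : coupling matrix (non-negative, sums to one)
    alpha   : trade-off, alpha=0 -> feature (Wasserstein-like) term only,
              alpha=1 -> structure (Gromov-Wasserstein-like) term only
    """
    n, m = len(C1), len(C2)
    cost = 0.0
    for i in range(n):
        for j in range(m):
            for k in range(n):
                for l in range(m):
                    cost += ((1 - alpha) * M[i][j] ** q
                             + alpha * abs(C1[i][k] - C2[j][l]) ** q) * pi[i][j] * pi[k][l]
    return cost
```

Since the coupling sums to one, the feature term collapses to the usual linear Wasserstein cost, so α interpolates between a Wasserstein-like and a Gromov–Wasserstein-like loss. The FGW distance itself minimises this objective over all admissible couplings, which in practice is done with solvers such as conditional gradient.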


Summary

Introduction

We focus on the comparison of structured objects, i.e., objects defined by both feature and structure information. The Optimal Transport (OT) framework defines distances between probability measures that describe either the feature or the structure information of structured objects. The Wasserstein distance d_{W,p}(μ_A, ν_B), widely used in the computer vision community, defines a distance on probability measures; in particular, d_{W,p}(μ_A, ν_B) = 0 iff μ_A = ν_B. This distance has a nice geometrical interpretation, as it represents the optimal cost (w.r.t. the ground metric d) of moving the measure μ_A onto ν_B, with π(a, b) the amount of probability mass shifted from a to b (see Figure 5). To this extent, the Wasserstein distance quantifies how "far" μ_A is from ν_B by measuring how "difficult" it is to move all the mass from μ_A onto ν_B. Optimal transport can deal with both smooth and discrete measures, and it has proved very useful for comparing distributions in a shared space, even when their supports differ or do not overlap.
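
As a small illustration of this "cost of moving mass" view (our own sketch, not from the paper): on the real line, with two empirical measures made of the same number of equally weighted samples, the optimal coupling is the monotone matching, so W_p reduces to sorting both samples and matching them in order:

```python
def wasserstein_1d(xs, ys, p=1):
    """W_p between two equally weighted empirical measures on the real line.

    Sorting both samples and matching them in order gives the optimal
    (monotone) transport plan in one dimension.
    """
    assert len(xs) == len(ys), "assumes the same number of samples"
    xs, ys = sorted(xs), sorted(ys)
    return (sum(abs(x - y) ** p for x, y in zip(xs, ys)) / len(xs)) ** (1 / p)
```

For instance, moving the two unit-mass samples {0, 1} onto {1, 2} costs 1 per sample, so the W_1 distance is 1; in higher dimensions the optimal coupling must instead be found by solving a linear program.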


