Wavelet $$s$$-Wasserstein distances for $$0 < s\leqslant\,1$$
Abstract: Motivated by classical harmonic analysis results characterizing Hölder spaces in terms of the decay of their wavelet coefficients, we consider wavelet methods for computing $$s$$-Wasserstein type distances. Previous work by Sheory (né Shirdhonkar) and Jacobs showed that, for $$0 < s\leqslant1$$, the $$s$$-Wasserstein distance $$W_s$$ between certain probability measures on Euclidean space is equivalent to a weighted $$\ell^1$$ difference of their wavelet coefficients. We demonstrate that the original statement of this equivalence is incorrect in a few aspects and, furthermore, fails to capture key properties of the $$W_s$$ distance, such as its behavior under translations of probability measures. Inspired by this, we consider a variant of the previous wavelet distance formula for which equivalence (up to an arbitrarily small error) does hold for $$0 < s < 1$$. We analyze the properties of this distance, one of which is that it provides a natural embedding of the $$s$$-Wasserstein space into a linear space. We conclude with several numerical simulations. Even though our theoretical result merely ensures that the new wavelet $$s$$-Wasserstein distance is equivalent to the classical $$W_s$$ distance (up to an error), our numerical simulations show that the new wavelet distance succeeds in capturing the behavior of the exact $$W_s$$ distance under translations and dilations of probability measures.
- Research Article
14
- 10.1090/s0002-9939-2011-10891-8
- Feb 28, 2011
- Proceedings of the American Mathematical Society
We prove that in the Wasserstein space built over R the subset of measures that do not charge the non-differentiability set of convex functions is not displacement convex. This completes the study of Gigli on the geometric structure of measures meeting the sharp hypothesis of the refined version of Brenier’s Theorem. Optimal transport is nowadays a central tool in many fields of analysis, differential geometry and probability theory (see e.g. the books by Villani [11, 12]). It originates in the problem of Monge, which asks for the shortest way to displace an amount of soil from one place in Euclidean space to a heap of soil at another place. This problem induces a very natural distance between probability measures, the Wasserstein distance, where the measures represent the heaps of soil. The Wasserstein distance is somewhat complementary to the Lebesgue L^p norms because it corresponds to a “horizontal” displacement: to a first approximation, for two localized probability measures on R, the Wasserstein distance is the distance between their barycenters. Brenier’s Theorem [4] on monotone rearrangement of maps of R has become the very core of the theory of optimal transport. It gives a representation of the optimal transport map in terms of gradients of convex functions. A very enlightening heuristic on (P2(R), W2) is proposed in [7], where it appears with an infinite-dimensional differential structure and the Wasserstein distance is seen as a Riemannian-like distance. This point of view has spurred many developments, among which is the gradient flow theory presented in [2]. In [6], Gigli explores the Riemannian-like structure and proposes to think of the measures meeting the hypothesis of a refined version of Brenier’s Theorem as the regular points of the Wasserstein space. In this paper we show that the set of those transport-regular measures is not geodesically convex (Theorems 2.2 and 2.5).
This is quite surprising, because it is well known that the subset made of absolutely continuous measures (the best-known transport-regular measures) is geodesically convex. The result answers a question suggested by Gigli [6, Remark 2.12]. The paper is organized as follows: we first recall some main results on the quadratic Monge(-Kantorovich) problem. The key idea of our result can be found in Lemma 1.6, while Proposition 1.8 gives a way, together with Proposition 1.7, to characterize transport-regular measures. In the second (and last) part, we prove the main theorem (Theorem 2.2) and its generalization (Theorem 2.5) by means of explicit constructions. 2010 Mathematics Subject Classification. Primary 28A75.
- Research Article
1
- 10.1007/s12652-023-04640-7
- May 22, 2023
- Journal of Ambient Intelligence and Humanized Computing
Bayesian optimization (BO) based on the Gaussian process model (GP-BO) has become the most widely used approach for the global optimization of black-box functions and computationally expensive optimization problems. BO has proved its sample efficiency and its versatility in a wide range of engineering and machine learning problems. A limiting factor in its applications is the difficulty of scaling beyond 15–20 dimensions. To mitigate this drawback, it has been observed that optimization problems can have a lower intrinsic dimensionality. Several optimization strategies, built on this observation, map the original problem into a lower-dimensional manifold. In this paper we take a novel approach, mapping the original problem into a space of discrete probability distributions endowed with a Wasserstein metric. The Wasserstein space is a non-linear manifold whose elements are discrete probability distributions. The input of the Gaussian process is given by discrete probability distributions and the acquisition function becomes a functional in the Wasserstein space. The minimizer of the acquisition functional in the Wasserstein space is then mapped back to the original space using a neural network. Computational results for three test functions with dimensionality ranging from 5 to 100 show that the exploration in the Wasserstein space is significantly more effective than that performed by plain Bayesian optimization in the Euclidean space, and that its advantage grows with the dimension of the search space.
- Research Article
25
- 10.1007/978-3-319-19992-4_32
- Jan 1, 2015
- Information processing in medical imaging : proceedings of the ... conference
Brain morphometry study plays a fundamental role in medical imaging analysis and diagnosis. This work proposes a novel framework for brain cortical surface classification using Wasserstein distance, based on uniformization theory and Riemannian optimal mass transport theory. By the Poincaré uniformization theorem, all shapes can be conformally deformed to one of the three canonical spaces: the unit sphere, the Euclidean plane or the hyperbolic plane. The uniformization map will distort the surface area elements. The area-distortion factor gives a probability measure on the canonical uniformization space. All the probability measures on a Riemannian manifold form the Wasserstein space. Given any two probability measures, there is a unique optimal mass transport map between them, and the transportation cost defines the Wasserstein distance between them. Wasserstein distance gives a Riemannian metric for the Wasserstein space. It intrinsically measures the dissimilarities between shapes and thus has the potential for shape classification. To the best of our knowledge, this is the first work to introduce the optimal mass transport map to general Riemannian manifolds. The method is based on the geodesic power Voronoi diagram. Compared to conventional methods, our approach depends solely on Riemannian metrics and is invariant under rigid motions and scalings, so it intrinsically measures shape distance. Experimental results on classifying brain cortical surfaces with different intelligence quotients demonstrated the efficiency and efficacy of our method.
- Research Article
- 10.1080/10618600.2024.2404708
- Nov 9, 2024
- Journal of Computational and Graphical Statistics
Survival analysis plays a pivotal role in medical research, offering valuable insights into the timing of events such as survival time. One common challenge in survival analysis is the necessity to adjust the survival function to account for additional factors, such as age, gender, and ethnicity. We propose an innovative regression model for right-censored survival data across heterogeneous populations, leveraging the Wasserstein space of probability measures. Our approach models the probability measure of survival time and the corresponding nonparametric Kaplan-Meier estimator for each subgroup as elements of the Wasserstein space. The Wasserstein space provides a flexible framework for modeling heterogeneous populations, allowing us to capture complex relationships between covariates and survival times. We address an underexplored aspect by deriving the non-asymptotic convergence rate of the Kaplan-Meier estimator to the underlying probability measure in terms of the Wasserstein metric. The proposed model is supported with a solid theoretical foundation including pointwise and uniform convergence rates, along with an efficient algorithm for model fitting. The proposed model effectively accommodates random variation that may exist in the probability measures across different subgroups, demonstrating superior performance in both simulations and two case studies compared to the Cox proportional hazards model and other alternative models. Supplementary materials for this article are available online.
- Research Article
- 10.1108/ajeb-10-2023-0099
- Dec 15, 2023
- Asian Journal of Economics and Banking
Purpose: This paper aims mainly at introducing applied statisticians and econometricians to the current research methodology with non-Euclidean data sets. Specifically, it provides the basis and rationale for statistics in Wasserstein space, where the metric on probability measures is taken as a Wasserstein metric arising from optimal transport theory. Design/methodology/approach: The authors spell out the basis and rationale for using Wasserstein metrics on the data space of (random) probability measures. Findings: In elaborating the new statistical analysis of non-Euclidean data sets, the paper illustrates the generalization of traditional aspects of statistical inference following Frechet's program. Originality/value: Besides the elaboration of research methodology for a new data analysis, the paper discusses the applications of Wasserstein metrics to the robustness of financial risk measures.
- Book Chapter
- 10.1007/978-3-319-00227-9_9
- Jan 1, 2014
This chapter is a brief investigation of the links between optimal transportation methods and functional inequalities in the Markov operator framework of this monograph. After a brief introduction to the basic material on optimal transportation, the main topic of transportation cost inequalities and first examples for Gaussian measures are presented. Interpolation along the geodesics of optimal transport is used towards logarithmic Sobolev inequalities and transportation cost inequalities comparing relative entropy and Wasserstein distances between probability measures. An alternate approach to sharp Sobolev or Gagliardo–Nirenberg inequalities in Euclidean space is provided next along these lines. Non-linear Hamilton–Jacobi equations and hypercontractivity properties of their solutions, analogous to the ones for linear heat equations, are investigated in the further sections towards the relationships between (quadratic) transportation cost inequalities and logarithmic Sobolev inequalities. Contraction properties in Wasserstein space along with the heat semigroup are investigated in the Markov operator setting. The last section is a very brief overview of recent developments towards a notion of Ricci curvature lower bounds based on optimal transportation and the connection with the Γ-calculus developed in this work.
- Preprint Article
- 10.5194/egusphere-egu21-12708
- Mar 4, 2021
The issue of vulnerability and robustness in networked systems can be addressed by several methods. The most widely used are based on a set of centrality and connectivity measures from network theory, which basically relate vulnerability to the loss of efficiency caused by the removal of some nodes and edges. Another related view is given by the analysis of the spectra of the adjacency and Laplacian matrices of the graph associated with the networked system.
The main contribution of this paper is the introduction of a new set of vulnerability metrics given by the distance between the probability distribution of node-node distances in the original network and that in the network resulting from the removal of nodes/edges. Two such probabilistic measures have been analysed: the Jensen-Shannon (JS) divergence and the Wasserstein (WST) distance, also known as the Earth-Mover distance. This name comes from its informal interpretation as the minimum energy cost of moving and transforming a pile of dirt in the shape of one probability distribution into the shape of the other distribution. The cost is quantified by the amount of dirt moved times the moving distance. The Wasserstein distance can be traced back to the works of Gaspard Monge in 1781 and Lev Kantorovich in 1942. Wasserstein distances are generally well defined and provide an interpretable distance metric between distributions. Computing Wasserstein distances requires in general the solution of a constrained linear optimization problem, which is very large when the support of the probability distributions is multidimensional.
An advantage of the Wasserstein distance is that, under quite general conditions, it is a differentiable function of the parameters of the distributions, which makes it possible to use it to assess the sensitivity of the network robustness to distributional perturbations. The computational results for two real-life water distribution networks confirm that the values of the JS and WST distances are strongly related to the criticality of the removed edges. Both are more discriminating, at least for water distribution networks, than efficiency-based and spectral measures. A general methodological scheme has been developed connecting different modelling and computational elements, concepts and analysis tools, to create a framework suitable for analysing robustness. This modelling and algorithmic framework can also support the analysis of other networked infrastructures, among which are power grids, gas distribution and transit networks.
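As a rough illustration of the two measures compared in this abstract, the Python sketch below computes the Wasserstein-1 distance and the Jensen-Shannon divergence between two histograms of node-node distances. The histograms, the function names, and the unit spacing of the support (path length in hops) are assumptions made for the example; they are not taken from the paper, whose actual pipeline is not described here.

```python
import math


def normalize(counts):
    """Turn a histogram of shortest-path lengths into a probability vector."""
    total = sum(counts)
    return [c / total for c in counts]


def wasserstein_1d(p, q):
    """W1 between two distributions on the same integer grid 0, 1, ..., n-1.

    On the real line with unit grid spacing, W1 equals the sum of the
    absolute differences between the two cumulative distribution functions.
    """
    cdf_gap = 0.0
    total = 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi - qi
        total += abs(cdf_gap)
    return total


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (natural log), with 0 * log 0 taken as 0."""
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical histograms of node-node distances (index = path length in hops),
# before and after removing an edge.
original = normalize([0, 6, 3, 1, 0])
damaged = normalize([0, 4, 3, 2, 1])

w1 = wasserstein_1d(original, damaged)
js = jensen_shannon(original, damaged)
print(f"W1 = {w1:.3f}, JS = {js:.3f}")
```

In one dimension the cumulative-sum formula avoids the linear program mentioned in the abstract; for multidimensional supports a general optimal transport solver would be needed instead.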
- Research Article
- 10.1007/s10898-025-01463-y
- Jan 11, 2025
- Journal of Global Optimization
Gaussian Process regression is a kernel method successfully adopted in many real-life applications. Recently, there has been growing interest in extending this method to non-Euclidean input spaces, like the one considered in this paper, which consists of probability measures. Although a positive definite kernel can be defined by using a suitable distance, the Wasserstein distance, the common procedure for learning the Gaussian Process model can fail due to numerical issues. These issues arise earlier and more frequently than in the case of a Euclidean input space and, as demonstrated, cannot be avoided by adding artificial noise (nugget effect) as is usually done. This paper uncovers the main reason for these issues, namely a non-stationarity relation between the Wasserstein-based squared exponential kernel and its Euclidean counterpart. As a relevant result, we learn a Gaussian Process model by treating the input space as Euclidean and then use an algebraic transformation, based on the uncovered relation, to transform it into a non-stationary and Wasserstein-based Gaussian Process model over probability measures. This algebraic transformation is simpler than the log-exp maps used on data belonging to Riemannian manifolds and recently extended to consider the pseudo-Riemannian structure of an input space equipped with the Wasserstein distance.
- Research Article
5
- 10.1007/s41884-021-00063-5
- Nov 15, 2021
- Information Geometry
We study Bregman divergences in probability density space embedded with the $$L^2$$-Wasserstein metric. Several properties and dualities of transport Bregman divergences are provided. In particular, we derive the transport Kullback–Leibler (KL) divergence by a Bregman divergence of negative Boltzmann–Shannon entropy in $$L^2$$-Wasserstein space. We also derive analytical formulas and generalizations of transport KL divergence for one-dimensional probability densities and Gaussian families.
- Research Article
149
- 10.1002/mana.19901470121
- Jan 1, 1990
- Mathematische Nachrichten
For a separable metric space (X, d), the $$L^p$$ Wasserstein metrics between probability measures μ and ν on X are defined by $$W_p(\mu,\nu) = \left(\inf_{\eta} \int_{X\times X} d(x,y)^p \, d\eta(x,y)\right)^{1/p},$$ where the infimum is taken over all probability measures η on X × X with marginal distributions μ and ν, respectively. After mentioning some basic properties of these metrics as well as explicit formulae for X = R, a formula for the $$L^2$$ Wasserstein metric with X = R^n will be cited from [5], [9], and [21] and proved for any two probability measures of a family of elliptically contoured distributions. Finally, this result will be generalized for Gaussian measures to the case of a separable Hilbert space.
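For X = R, the explicit formulae mentioned in this abstract take a simple form. The Python sketch below, which is not taken from the paper, covers two standard cases: empirical measures with the same number of equally weighted atoms, where for p ≥ 1 the optimal coupling pairs the sorted samples, and one-dimensional Gaussians, where the squared $$L^2$$ Wasserstein distance is the sum of the squared differences of the means and of the standard deviations. The function names are hypothetical.

```python
import math


def wasserstein_p_1d(xs, ys, p=2.0):
    """L^p Wasserstein distance between two empirical measures on the real line.

    Assumes p >= 1 and that both samples have the same number of equally
    weighted atoms, so the optimal coupling matches the i-th smallest point
    of xs with the i-th smallest point of ys.
    """
    if p < 1:
        raise ValueError("sorted matching is only optimal for p >= 1")
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    total = sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys)))
    return (total / len(xs)) ** (1.0 / p)


def gaussian_w2_1d(mean1, std1, mean2, std2):
    """L^2 Wasserstein distance between N(mean1, std1^2) and N(mean2, std2^2)."""
    return math.sqrt((mean1 - mean2) ** 2 + (std1 - std2) ** 2)


# Translating a sample by 1 moves it by exactly 1 in every W_p, p >= 1.
sample = [0.0, 1.0, 2.0]
shifted = [x + 1.0 for x in sample]
print(wasserstein_p_1d(sample, shifted, p=1.0))
print(wasserstein_p_1d(sample, shifted, p=2.0))
print(gaussian_w2_1d(0.0, 1.0, 3.0, 5.0))
```

Sorted matching is only valid for convex costs, which is why the sketch rejects p < 1; it should not be reused for concave costs such as $$|x-y|^s$$ with $$0 < s < 1$$.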
- Research Article
94
- 10.1214/15-aihp706
- Feb 1, 2017
- Annales de l'Institut Henri Poincaré, Probabilités et Statistiques
Geodesic PCA in the Wasserstein space by convex PCA
- Research Article
59
- 10.1016/j.aim.2016.11.026
- Nov 29, 2016
- Advances in Mathematics
Wasserstein barycenters over Riemannian manifolds
- Research Article
1
- 10.1080/00401706.2023.2174602
- Feb 22, 2023
- Technometrics
Nowadays stochastic computer simulations with both numeral and distributional inputs are widely used to mimic complex systems which contain a great deal of uncertainty. This article studies the design and analysis issues of such computer experiments. First, we provide preliminary results concerning the Wasserstein distance in probability measure spaces. To handle the product space of the Euclidean space and the probability measure space, we prove that, through the mapping from a point in the Euclidean space to the mass probability measure at this point, the Euclidean space can be isomorphic to the subset of the probability measure space, which consists of all the mass measures, with respect to the Wasserstein distance. Therefore, the product space can be viewed as a product probability measure space. We derive formulas of the Wasserstein distance between two components of this product probability measure space. Second, we use the above results to construct Wasserstein distance-based space-filling criteria in the product space of the Euclidean space and the probability measure space. A class of optimal Latin hypercube-type designs in this product space are proposed. Third, we present a Wasserstein distance-based Gaussian process model to analyze data from computer experiments with both numeral and distributional inputs. Numerical examples and real applications to a metro simulation are presented to show the effectiveness of our methods.
- Research Article
- 10.1093/imaiai/iaad027
- Apr 27, 2023
- Information and Inference: A Journal of the IMA
We study the $k$-nearest neighbour classifier ($k$-NN) of probability measures under the Wasserstein distance. We show that the $k$-NN classifier is not universally consistent on the space of measures supported in $(0,1)$. As any Euclidean ball contains a copy of $(0,1)$, one should not expect to obtain universal consistency without some restriction on the base metric space, or the Wasserstein space itself. To this end, via the notion of $\sigma $-finite metric dimension, we show that the $k$-NN classifier is universally consistent on spaces of discrete measures (and more generally, $\sigma $-finite uniformly discrete measures) with rational mass. In addition, by studying the geodesic structures of the Wasserstein spaces for $p=1$ and $p=2$, we show that the $k$-NN classifier is universally consistent on spaces of measures supported on a finite set, the space of Gaussian measures and spaces of measures with finite wavelet series densities.
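To make the object of study concrete, here is a minimal Python sketch of a $k$-NN classifier whose inputs are probability measures compared in the Wasserstein distance with $p=1$. It assumes, purely for illustration, that each measure is an empirical measure on the real line with the same number of equally weighted atoms, so that the distance reduces to the mean absolute difference of sorted samples. The training data, labels, and function names are hypothetical, and the sketch says nothing about the consistency results of the paper.

```python
from collections import Counter


def w1_empirical(xs, ys):
    """W1 between two equal-size, equally weighted samples on the real line."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def knn_predict(train, query, k=3):
    """Majority vote among the k training measures closest to query in W1.

    train is a list of (sample, label) pairs; each sample represents a measure.
    """
    neighbours = sorted(train, key=lambda item: w1_empirical(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical labelled measures: two clusters of empirical distributions.
train = [
    ([0.0, 0.1, 0.2], "low"),
    ([0.1, 0.2, 0.3], "low"),
    ([5.0, 5.1, 5.2], "high"),
    ([5.1, 5.2, 5.3], "high"),
    ([4.9, 5.0, 5.4], "high"),
]

print(knn_predict(train, [0.05, 0.15, 0.25], k=3))
print(knn_predict(train, [5.0, 5.0, 5.0], k=3))
```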
- Research Article
82
- 10.1007/s00205-016-1026-7
- Jul 8, 2016
- Archive for Rational Mechanics and Analysis
The Wasserstein distances $$W_p$$ ($$p \geqslant 1$$), defined in terms of a solution to the Monge–Kantorovich problem, are known to be a useful tool to investigate transport equations. In particular, the Benamou–Brenier formula characterizes the square of the Wasserstein distance $$W_2$$ as the infimum of the kinetic energy, or action functional, of all vector fields transporting one measure to the other. Another important property of the Wasserstein distances is the Kantorovich–Rubinstein duality, stating the equality between the distance $$W_1(\mu, \nu)$$ of two probability measures μ, ν and the supremum of the integrals in d(μ − ν) of Lipschitz continuous functions with Lipschitz constant bounded by one. An intrinsic limitation of Wasserstein distances is the fact that they are defined only between measures having the same mass. To overcome such a limitation, we recently introduced the generalized Wasserstein distances $$W_p^{a,b}$$, defined in terms of both the classical Wasserstein distance $$W_p$$ and the total variation (or $$L^1$$) distance, see (Piccoli and Rossi in Archive for Rational Mechanics and Analysis 211(1):335–358, 2014). Here p plays the same role as for the classic Wasserstein distance, while a and b are weights for the transport and the total variation term. In this paper we prove two important properties of the generalized Wasserstein distances: (1) a generalized Benamou–Brenier formula providing the equality between $$W_2^{a,b}$$ and the supremum of an action functional, which includes a transport term (kinetic energy) and a source term; (2) a duality à la Kantorovich–Rubinstein establishing the equality between $$W_1^{1,1}$$ and the flat metric.