In machine learning, optimal transport (OT) theory is extensively utilized to compare probability distributions across various applications, such as graph data represented by node distributions and image data represented by pixel distributions. In practical scenarios, it is often necessary to solve multiple OT problems. Traditionally, these problems are treated independently, with each OT problem being solved sequentially. However, the computational complexity required to solve a single OT problem is already substantial, making the resolution of multiple OT problems even more challenging. Although many applications of fast solutions to OT are based on the premise of a single OT problem with arbitrary distributions, few efforts handle such multiple OT problems with multiple distributions. Therefore, we propose the anchor space OT (ASOT) problem: an approximate OT problem designed for multiple OT problems. This proposal stems from our finding that in many tasks the mass transport tends to be concentrated in a reduced space from the original feature space. By restricting the mass transport to a learned anchor point space, ASOT avoids pairwise instantiations of cost matrices for multiple OT problems and simplifies the problems by canceling insignificant transports. This simplification greatly reduces its computational costs. We then prove the upper bounds of its 1 -Wasserstein distance error between the proposed ASOT and the original OT problem under different conditions. Building upon this accomplishment, we propose three methods to learn anchor spaces for reducing the approximation error. Furthermore, our proposed methods present great advantages for handling distributions of different sizes with GPU parallelization.
Read full abstract