Abstract
Reeb spaces and their discretized versions, called Mappers, are common descriptors in topological data analysis, with applications in many fields of science, including computational biology and data visualization. The stability of the Mapper and its rate of convergence to the Reeb space have been studied extensively in recent work (Brown et al. in CoRR. arXiv:1909.03488, 2019; Carrière and Oudot in Found Comput Math 18(6):1333–1396, 2017; Carrière et al. in J Mach Learn Res 19(12):1–39, 2018; Munch and Wang in: 32nd international symposium on computational geometry (SoCG 2016), Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 51: 53:1–53:16, 2016), focusing on the case where a scalar-valued filter is used to compute the Mapper. Much less is known in the multivariate case, where the codomain of the filter is $${\mathbb {R}}^p$$, and in the general case, where it is an arbitrary metric space $$(\mathcal {Z},d_\mathcal {Z})$$ instead of $${\mathbb {R}}$$. The few results available in this setting (Dey et al. in: 33rd international symposium on computational geometry (SoCG 2017), Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 77, 36:1–36:16, 2017; Munch and Wang, 2016) handle only continuous topological spaces and cannot be used as is for finite metric spaces representing data, such as point clouds and distance matrices. In this article, we introduce a slight modification of the usual Mapper construction and give risk bounds for estimating the Reeb space with this estimator. Our approach applies in particular to the setting where the filter function used to compute Mapper is itself estimated from data, such as the eigenfunctions of PCA. Our results are stated with respect to the Gromov-Hausdorff distance, computed with the specific filter-based pseudometrics for Mappers and Reeb spaces defined in Dey et al. (2017).
Finally, we provide examples of this setting in statistics and machine learning for different kinds of target filters, as well as numerical experiments that demonstrate the relevance of our approach.