Abstract

Topological data analysis and its main method, persistent homology, provide a toolkit for computing topological information from high-dimensional and noisy data sets. Kernels for one-parameter persistent homology have been established to connect persistent homology with machine learning techniques, with applications in shape analysis, recognition, and classification. We contribute a kernel construction for multi-parameter persistence by integrating a one-parameter kernel, weighted, along straight lines. We prove that our kernel is stable and efficiently computable, which establishes a theoretical connection between topological data analysis and machine learning for multivariate data analysis.

Highlights

  • Topological data analysis (TDA) is an active area in data science with growing interest and notable successes in a number of applications in science and engineering [1,2,3,4,5,6,7,8]

  • TDA employs the mathematical notion of simplicial complexes [14] to encode higher order interactions in the system, and at its core uses the computational framework of persistent homology [15,16,17,18,19] to extract multi-scale topological features of the data

  • TDA extracts a rich set of topological features from high-dimensional and noisy data sets that complement geometric and statistical features, which offers a different perspective for machine learning


Summary

Introduction

Topological data analysis (TDA) is an active area in data science with growing interest and notable successes in a number of applications in science and engineering [1,2,3,4,5,6,7,8]. We establish, for the first time, a theoretical connection between topological features and machine learning algorithms via the kernel approach for multi-parameter persistent homology. Such a theoretical underpinning is necessary for applications in multivariate data analysis. Attractive alternatives are the (multi-parameter) bottleneck distance [41] and the matching distance [42,43]. The matching distance compares the persistence diagrams along all slices (appropriately weighted) and takes the worst discrepancy as the distance between the bi-filtrations. This distance can be approximated to a prescribed precision using an appropriate subsample of the lines [42], and computed exactly in polynomial time [43]. The software library RIVET [44] provides a visualization tool to explore bi-filtrations by scanning through the slices.
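
The kernel idea described above, a one-parameter kernel integrated with weights over straight-line slices, can be illustrated numerically by replacing the integral with a weighted sum over a finite sample of lines. The sketch below is only an illustration under several assumptions that are not taken from the text: the persistence diagrams restricted to each sampled line are assumed to be precomputed by some external tool, the one-parameter kernel is taken to be the persistence scale space kernel (one possible choice), and the function names, the `weights` argument, and the bandwidth `sigma` are hypothetical. Where the matching distance takes the worst discrepancy over slices, this kernel sums the weighted per-slice values.

```python
import math


def pss_kernel(diag_f, diag_g, sigma=1.0):
    """One-parameter kernel between two persistence diagrams.

    Each diagram is a list of (birth, death) pairs. This is the
    persistence scale space kernel, used here as an assumed stand-in
    for "a one-parameter kernel".
    """
    total = 0.0
    for b1, d1 in diag_f:
        for b2, d2 in diag_g:
            # Squared distance between the two points.
            direct = (b1 - b2) ** 2 + (d1 - d2) ** 2
            # Squared distance to the second point mirrored at the diagonal.
            mirrored = (b1 - d2) ** 2 + (d1 - b2) ** 2
            total += math.exp(-direct / (8 * sigma)) - math.exp(-mirrored / (8 * sigma))
    return total / (8 * math.pi * sigma)


def sliced_kernel(slices_a, slices_b, weights, sigma=1.0):
    """Weighted sum of one-parameter kernel values over sampled lines.

    slices_a[i] and slices_b[i] are the diagrams of the two data sets
    restricted to the i-th sampled line; weights[i] is the (hypothetical)
    quadrature weight assigned to that line.
    """
    if not (len(slices_a) == len(slices_b) == len(weights)):
        raise ValueError("need one diagram pair and one weight per sampled line")
    return sum(
        w * pss_kernel(da, db, sigma)
        for da, db, w in zip(slices_a, slices_b, weights)
    )


# Toy input: two sampled lines, with made-up diagrams for each data set.
slices_a = [[(0.0, 1.0)], [(0.2, 0.9), (0.5, 0.6)]]
slices_b = [[(0.1, 1.1)], [(0.3, 1.0)]]
weights = [0.5, 0.5]
value = sliced_kernel(slices_a, slices_b, weights)
```

Because each per-slice term is a kernel and the weights are non-negative, the weighted sum is again a kernel, which is what makes this finite-sample version usable with standard kernel methods. How the lines are sampled and weighted is left open here.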

Preliminaries
A feature map for multi-parameter persistent homology
Stability
Approximability
Conclusions and future developments
