Abstract

Persistent homology (PH) is an approach to topological data analysis (TDA) that computes multi-scale topological invariants of high-dimensional data that are robust to noise. While PH has revealed useful patterns across various applications, its computational requirements have limited it to small data sets of a few thousand points. We present Dory, an efficient and scalable algorithm that can compute the persistent homology of sparse Vietoris–Rips complexes on larger data sets, up to and including dimension two and over the field Z2. As an application, we compute the PH of the human genome at high resolution as revealed by a genome-wide Hi-C data set containing approximately three million points. Extant algorithms were unable to process this data set, whereas Dory processed it within five minutes using less than five GB of memory. The results show that the topology of the human genome changes significantly upon treatment with auxin, a molecule that degrades cohesin, corroborating the hypothesis that cohesin plays a crucial role in loop formation in DNA.
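
To make the central object concrete, the following is a minimal sketch of persistent homology in dimension zero only, on a Vietoris–Rips filtration sparsified by a distance threshold. It is not Dory's algorithm, which the abstract does not describe; it is the standard union-find computation, and the names `h0_persistence` and `tau` are hypothetical choices made for illustration. Each point is born at scale 0, and a connected component dies at the length of the edge that merges it into another component.

```python
from itertools import combinations
from math import dist, inf


def h0_persistence(points, tau):
    """Return H0 (birth, death) pairs of a Vietoris-Rips filtration.

    Illustrative sketch only: `tau` is an assumed truncation threshold,
    so edges longer than `tau` are never added (the "sparse" complex).
    """
    n = len(points)
    # Sparse edge list: keep only pairs at distance <= tau, sorted by length.
    edges = sorted(
        (d, i, j)
        for i, j in combinations(range(n), 2)
        if (d := dist(points[i], points[j])) <= tau
    )

    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # This edge merges two components: one H0 class dies here.
            parent[ri] = rj
            pairs.append((0.0, length))

    # Components that never merge below tau persist indefinitely.
    roots = {find(i) for i in range(n)}
    pairs.extend((0.0, inf) for _ in roots)
    return pairs


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
    print(h0_persistence(pts, tau=2.0))
    print(h0_persistence(pts, tau=10.0))
```

In dimension zero the coefficient field plays no role; the choice of Z2 matters for the dimension-one and dimension-two computations, which require boundary-matrix reduction and are beyond this sketch.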
