Abstract
The need to understand cell developmental processes spawned a plethora of computational methods for discovering hierarchies from scRNAseq data. However, existing techniques are based on Euclidean geometry, a suboptimal choice for modeling complex cell trajectories with multiple branches. To overcome this fundamental representation issue we propose Poincaré maps, a method that harness the power of hyperbolic geometry into the realm of single-cell data analysis. Often understood as a continuous extension of trees, hyperbolic geometry enables the embedding of complex hierarchical data in only two dimensions while preserving the pairwise distances between points in the hierarchy. This enables the use of our embeddings in a wide variety of downstream data analysis tasks, such as visualization, clustering, lineage detection and pseudotime inference. When compared to existing methods — unable to address all these important tasks using a single embedding — Poincaré maps produce state-of-the-art two-dimensional representations of cell trajectories on multiple scRNAseq datasets.
Highlights
The need to understand cell developmental processes spawned a plethora of computational methods for discovering hierarchies from scRNAseq data
Given feature representations of cells such as their gene expressions, we aim to estimate the structure of the underlying tree-like manifold in three main steps (Fig. 1 and Methods): First, we compute a connected knearest-neighbor graph[26] where each node corresponds to an individual cell and each edge has a weight proportional to the Euclidean distance between the features of the two connected cells
We compute global geodesic distances from the kNN graph, by traveling between all pairs of points along the weighted edges. This step can be computed efficiently using all pairs of shortest paths, or related measures such as the Relative Forest Accessibilities (RFA) index[27]
Summary
The need to understand cell developmental processes spawned a plethora of computational methods for discovering hierarchies from scRNAseq data. Computational methods to accurately discover and represent cell development processes from large datasets and noisy measurements are in great demand This is a challenging task since methods are required to reveal the progression of cells along continuous trajectories with tree-like structures and multiple branches (e.g., as in Waddington’s classic epigenetic landscape[5]). To visualize hierarchical relationships in cell development, many state-of-the-art methods embed cell measurements in lowdimensional Euclidean spaces[7,8,17,18] This approach is limited when modeling complex hierarchies, as low-dimensional Euclidean embeddings distort pairwise distances between measurements substantially. Monocle 215 forces a Waddington’s epigenetic landscape b Hyperbolic space c
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.