Abstract
Single-cell DNA methylation sequencing technology has brought new perspectives to the investigation of epigenetic heterogeneity, creating a need for computational methods that cluster cells based on single-cell methylation profiles. Although several methods have been developed, most of them cluster cells based on a single (dis)similarity measure, failing to capture complete cell heterogeneity and yielding locally optimal solutions. Here, we present scMelody, which uses an enhanced consensus-based clustering model to reconstruct cell-to-cell methylation similarity patterns and identifies cell subpopulations by leveraging information from multiple basic similarity measures. In addition, because of the reconstructed cell-to-cell similarity measure, scMelody can conveniently use clustering validation criteria to determine the optimal number of clusters. Assessments on distinct real datasets showed that scMelody accurately recapitulated methylation subpopulations and outperformed existing methods in terms of both cluster partitions and the number of clusters. Moreover, when its clustering stability was benchmarked on a variety of synthetic datasets, scMelody achieved significant clustering performance gains over existing methods and robustly maintained its clustering accuracy over a wide range of cell numbers, cluster numbers and CpG dropout proportions. Finally, real case studies demonstrated the capability of scMelody to assess known cell types and uncover novel cell clusters.
Highlights
As a heritable covalent chemical modification, DNA methylation is closely correlated with cell growth, differentiation, and transformation, and plays decisive roles in diseases and tumorigenesis (Aran and Hellman, 2013; Oakes et al, 2016; Koch et al, 2018).
We found that scMelody spent more than 99% of its running time calculating the basic cell-to-cell similarity matrices for the input single-cell methylation profiles (Supplementary Figure S4), and this was also true for single-distance-based methods, such as PearsonHC and PDclust.
We propose scMelody, an enhanced consensus-based clustering model that analyzes single-cell methylation data by reconstructing cell-to-cell pairwise similarity.
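To make the running-time highlight concrete, the following is a minimal sketch of one basic cell-to-cell similarity matrix. It is not scMelody's implementation: it assumes Pearson correlation restricted to CpG sites covered in both cells, with `None` marking a dropout, and the names `pearson_shared` and `similarity_matrix` are hypothetical. Every pair of cells is compared once, so the cost grows quadratically with the number of cells, which is consistent with this step dominating the running time.

```python
from math import sqrt


def pearson_shared(x, y):
    """Pearson correlation over CpG sites covered in both cells.

    Assumption: each cell is a list of methylation values, with None
    marking a site that was not covered (dropout).
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if len(pairs) < 2:
        return 0.0  # too few shared sites to estimate a correlation
    mean_a = sum(a for a, _ in pairs) / len(pairs)
    mean_b = sum(b for _, b in pairs) / len(pairs)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant profile has no defined correlation
    return cov / sqrt(var_a * var_b)


def similarity_matrix(cells):
    """Symmetric cell-by-cell similarity matrix; each pair is computed once."""
    n = len(cells)
    sim = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim[i][j] = sim[j][i] = pearson_shared(cells[i], cells[j])
    return sim


# Toy input: three cells, five CpG sites, two dropouts.
cells = [
    [1, 0, 1, None, 0],
    [1, 0, 1, 1, 0],
    [0, 1, 0, 0, None],
]
sim = similarity_matrix(cells)
```

In this toy input, cells 0 and 1 agree on every shared site and cells 0 and 2 disagree on every shared site, so their similarities land at the two extremes of the correlation range.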
Summary
As a heritable covalent chemical modification, DNA methylation is closely correlated with cell growth, differentiation, and transformation, and plays decisive roles in diseases and tumorigenesis (Aran and Hellman, 2013; Oakes et al, 2016; Koch et al, 2018). Recent advancements in ensemble clustering (Ghaemi et al, 2009; Vega-Pons and Ruiz-Shulcloper, 2011; Boongoen and Iam-On, 2018) have demonstrated that integrating various basic cell partitions in a consensus matrix is effective for generating improved clustering solutions (Kiselev et al, 2017; Zhu et al, 2020; Cui et al, 2021; Wang et al, 2021). Datasets (GEO accessions): GSE56879, GSE65196, GSE65364, GSE83882, GSE87197 and GSE97179. scMelody showed the most advanced performance over previous methods in clustering single-cell methylation data.
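The idea of integrating basic cell partitions in a consensus matrix can be sketched with a co-association matrix, a standard construction in ensemble clustering. This is an illustration only and not scMelody's model, which the abstract describes as working from multiple similarity measures. The function names and the 0.5 threshold are assumptions made for the example.

```python
def consensus_matrix(partitions):
    """Co-association matrix: entry [i][j] is the fraction of base
    partitions in which cells i and j receive the same cluster label."""
    n_cells = len(partitions[0])
    if any(len(labels) != n_cells for labels in partitions):
        raise ValueError("all partitions must label the same cells")
    counts = [[0] * n_cells for _ in range(n_cells)]
    for labels in partitions:
        for i in range(n_cells):
            for j in range(n_cells):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def consensus_clusters(matrix, threshold=0.5):
    """Group cells linked, directly or through other cells, by a
    co-association above the threshold (connected components).
    The 0.5 default is an arbitrary choice for this sketch."""
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three base partitions of five cells; label values are arbitrary per partition.
partitions = [
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1],
]
matrix = consensus_matrix(partitions)
final_labels = consensus_clusters(matrix)  # [0, 0, 1, 1, 1]
```

Cell 2 is grouped with cells 0 and 1 in one base partition and with cells 3 and 4 in the other two, so the consensus assigns it to the second group. No single base partition has to be trusted on its own.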