CDSKNNXMBD: a novel clustering framework for large-scale single-cell data based on a stable graph structure

Jun Ren, Xuejing Lyu, Jintao Guo, Xiaodong Shi, Ying Zhou, Qiyuan Li

doi:10.1186/s12967-024-05009-w

Abstract

Background: Accurate and efficient cell grouping is essential for analyzing single-cell transcriptome sequencing (scRNA-seq) data. However, existing clustering techniques often struggle to provide timely and accurate cell type groupings on datasets that are large-scale or have imbalanced cell types. Improved methods are therefore needed that can handle the increasing size of scRNA-seq datasets while maintaining high accuracy and efficiency.

Methods: We propose CDSKNNXMBD (Community Detection based on a Stable K-Nearest Neighbor Graph Structure), a novel single-cell clustering framework that integrates a partition clustering algorithm with a community detection algorithm and achieves accurate and fast cell type grouping by finding a stable graph structure.

Results: We evaluated the effectiveness of our approach by analyzing 15 tissues from the human fetal atlas. Compared with existing methods, CDSKNN effectively counteracts the high imbalance in single-cell data, enabling effective clustering. Furthermore, we conducted comparisons across multiple single-cell datasets from different studies and sequencing techniques. CDSKNN shows high applicability and robustness and is capable of handling the complexities of diverse types of data. Most importantly, CDSKNN exhibits higher operational efficiency on datasets at the million-cell scale, requiring an average of only 6.33 min to cluster 1.46 million single cells and saving 33.3% to 99% of the running time of existing methods.

Conclusions: CDSKNN is a flexible, resilient, and promising clustering tool that is particularly suitable for clustering imbalanced data and demonstrates high efficiency on large-scale scRNA-seq datasets.
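The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of combining partition clustering with community detection on a k-nearest-neighbor graph. It is not the authors' implementation. Everything in it is an assumption made for illustration: the function names (`kmeans`, `mutual_knn_graph`, `label_propagation`, `cluster`), the deterministic centroid initialization, the use of mutual k-nearest-neighbor edges as a stand-in for a "stable" graph, and simple label propagation as the community detection step.

```python
import math
from collections import Counter

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means. Returns (centroids, assignment).
    Assumption: centroids start at evenly spaced input points (deterministic)."""
    centroids = [points[i * len(points) // k] for i in range(k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        assign = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Recompute centroids; an empty partition keeps its old centroid.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assign

def mutual_knn_graph(nodes, k):
    """Keep edge i-j only if each node is among the other's k nearest neighbors.
    Assumption: mutual edges are used here as a proxy for a 'stable' graph."""
    n = len(nodes)
    knn = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(nodes[i], nodes[j]))
        knn.append(set(others[:k]))
    return {i: {j for j in knn[i] if i in knn[j]} for i in range(n)}

def label_propagation(graph, rounds=50):
    """Deterministic label propagation: each node adopts the most common label
    among itself and its neighbors; ties go to the smallest label."""
    labels = {i: i for i in graph}
    for _ in range(rounds):
        changed = False
        for i in sorted(graph):
            if not graph[i]:
                continue  # isolated node keeps its own label
            counts = Counter(labels[j] for j in graph[i])
            counts[labels[i]] += 1
            best = min(counts, key=lambda lab: (-counts[lab], lab))
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels

def cluster(points, n_parts=8, k=3):
    """Over-partition the points, link partitions in a mutual kNN graph,
    then merge partitions into communities and map labels back to points."""
    centroids, assign = kmeans(points, n_parts)
    graph = mutual_knn_graph(centroids, k)
    communities = label_propagation(graph)
    return [communities[a] for a in assign]
```

In this sketch the graph is built over a small number of partition centroids rather than over all points, which is one plausible way such a combination could reduce cost on large datasets. Whether CDSKNN works this way is not stated in the abstract.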

Journal: Journal of Translational Medicine
Publication Date: Mar 3, 2024
License type: CC BY 4.0

