Abstract

Single cell profiling has proven to be a powerful tool in molecular biology for understanding the complex behaviours of heterogeneous systems. Defining the properties of single cells is the primary endpoint of such analysis; cells are typically clustered to identify the common determinants that can be used to describe the functional properties of the cell mixture under investigation. Several approaches have been proposed to identify cell clusters; while this is a matter of active research, one popular approach is based on community detection in neighbourhood graphs by optimisation of modularity. In this paper we propose an alternative and principled solution to this problem, based on Stochastic Block Models. We show that such an approach is not only suitable for the identification of cell groups, but also provides a solid framework for other relevant tasks in single cell analysis, such as label transfer. To encourage the use of Stochastic Block Models, we developed a Python library, schist, that is compatible with the popular scanpy framework.

Highlights

  • Transcriptome analysis at the single cell level by RNA sequencing is a technology growing in popularity and applications [1]

  • It has been applied to study the biology of complex tissues [2, 3], tumor dynamics [4, 5, 6, 7], development [8, 9] and to describe whole organisms [10, 11]

  • A notable example is the analysis of cell trajectories, which can be derived from the analysis of Markov processes traversing the ...

Morelli et al BMC Bioinformatics (2021) 22:576

Introduction

Transcriptome analysis at the single cell level by RNA sequencing (scRNA-seq) is a technology growing in popularity and applications [1]. As the popularity of the single cell analysis frameworks Seurat [21] and scanpy [22] rose, methods based on graph partitioning became the de facto standard. Such methods require the construction of a cell neighbourhood graph (e.g. by k Nearest Neighbours, kNN, or shared Nearest Neighbours, sNN). Encoding cell-to-cell similarities into graphs has practical advantages beyond clustering, as many algorithms for graph analysis can be applied and interpreted in a biological way. A notable example is the analysis of cell trajectories, which can be derived from the analysis of Markov processes traversing the ...
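To make the graph-based setting concrete, the following is a minimal sketch of the two ingredients mentioned above: building a kNN graph from cell coordinates and scoring a candidate partition by modularity. It uses only the Python standard library on toy 2D points. The function names `knn_graph` and `modularity` and the toy data are illustrative assumptions; this is not the API of schist, scanpy or Seurat, and it does not implement a Stochastic Block Model.

```python
import math


def knn_graph(points, k):
    """Build an undirected kNN graph; edges are (i, j) tuples with i < j."""
    edges = set()
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to p
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        # Connect p to its k nearest neighbours (symmetrised by the set)
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def modularity(edges, labels):
    """Newman modularity: Q = sum_c [ e_c / m - (d_c / (2m))^2 ]."""
    m = len(edges)
    intra = {}   # e_c: number of edges with both ends in group c
    degree = {}  # d_c: summed degree of the nodes in group c
    for i, j in edges:
        degree[labels[i]] = degree.get(labels[i], 0) + 1
        degree[labels[j]] = degree.get(labels[j], 0) + 1
        if labels[i] == labels[j]:
            intra[labels[i]] = intra.get(labels[i], 0) + 1
    return sum(
        intra.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items()
    )


# Toy "cells": two well separated groups of four points each (hypothetical data)
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
edges = knn_graph(points, k=2)

# A partition that follows the two groups, and one that ignores them
q_good = modularity(edges, [0, 0, 0, 0, 1, 1, 1, 1])
q_bad = modularity(edges, [0, 1, 0, 1, 0, 1, 0, 1])
```

On this toy graph the partition that follows the two groups scores higher than the alternating one, which is the quantity that modularity-based community detection tries to maximise over partitions.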
