Abstract

We present Clustering and Lineage Inference in Single-Cell Transcriptional Analysis (CALISTA), a numerically efficient and highly scalable toolbox for end-to-end analysis of single-cell transcriptomic profiles. CALISTA provides four essential single-cell analyses for cell differentiation studies: single-cell clustering, reconstruction of cell lineage specification, transition gene identification, and cell pseudotime ordering. These analyses can be applied individually or in a pipeline. In each of them, we employ a likelihood-based approach in which single-cell mRNA counts are described by a probability distribution associated with stochastic gene transcriptional bursts and random technical dropout events. We illustrate the efficacy of CALISTA using single-cell gene expression datasets generated by different single-cell transcriptional profiling technologies and ranging from a few hundred to tens of thousands of cells. CALISTA is freely available at https://www.cabselab.com/calista.

Highlights

  • The differentiation of stem cells into multiple cell types relies on the dynamic regulation of gene expression (Ralston and Shaw, 2008)

  • The single-cell clustering in CALISTA is an adaptation of the algorithm Simulated Annealing for Bursty Expression Clustering (SABEC) (Ezer et al, 2016), where the single-cell clustering is carried out in two steps as illustrated in Figure 1b: (1) independent runs of maximum likelihood clustering, and (2) consensus clustering

  • We further evaluated CALISTA’s performance using four single-cell gene transcriptional datasets from cell differentiation systems with a variety of lineage topologies: the Bargaje et al study on the differentiation of human induced pluripotent stem cells into cardiomyocytes (Bargaje et al, 2017), the Chu et al study on the differentiation of human embryonic stem cells into endodermal cells (Chu et al, 2016), the Moignard et al study on hematopoietic stem cell (HSC) differentiation (Moignard et al, 2013), and the Treutlein et al study on mouse embryonic fibroblast differentiation into neurons (Treutlein et al, 2016)
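
The two-step clustering named in the highlights (independent clustering runs followed by consensus clustering) can be illustrated with a minimal Python sketch. It is not CALISTA's code: the function names `consensus_matrix` and `consensus_labels`, the 0.5 threshold, and the use of connected components as the final grouping step are all assumptions made for illustration, and the per-run labels are toy values standing in for the output of the maximum likelihood runs.

```python
from itertools import combinations


def consensus_matrix(runs):
    """Fraction of runs in which each pair of cells lands in the same cluster."""
    n_cells = len(runs[0])
    counts = [[0] * n_cells for _ in range(n_cells)]
    for labels in runs:
        if len(labels) != n_cells:
            raise ValueError("every run must label the same set of cells")
        for i in range(n_cells):
            for j in range(n_cells):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(runs) for c in row] for row in counts]


def consensus_labels(matrix, threshold=0.5):
    """Final grouping (illustrative assumption): connected components of the
    graph linking cells whose co-clustering frequency exceeds `threshold`."""
    n_cells = len(matrix)
    parent = list(range(n_cells))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n_cells), 2):
        if matrix[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel the roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n_cells)]


# Toy labels for five cells from three independent runs (hypothetical values).
runs = [
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
]
final_labels = consensus_labels(consensus_matrix(runs))
print(final_labels)
```

Cluster identifiers are arbitrary within each run, so the consensus matrix compares only whether two cells share a label, never the label values themselves. That is what allows independent runs to be combined without first matching their cluster numbering.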


Summary

Introduction

The differentiation of stem cells into multiple cell types relies on the dynamic regulation of gene expression (Ralston and Shaw, 2008). In this regard, advances in single-cell gene transcriptional profiling technology have given a tremendous boost to elucidating the decision-making process governing stem cell commitment to different cell fates (Kalisky et al, 2018). Stochastic gene transcription has been shown to generate highly non-Gaussian mRNA count distributions (Raj et al, 2006), which complicate data analysis using established methods that rely on a standard noise distribution model (e.g., Gaussian or Student’s t-distribution).

