Abstract

The transcriptome of single cells can reveal important information about cellular states and heterogeneity within populations of cells. Recently, single-cell RNA-sequencing has facilitated expression profiling of large numbers of single cells in parallel. To fully exploit these data, it is critical that suitable computational approaches are developed. One key challenge, especially pertinent when considering dividing populations of cells, is to understand the cell-cycle stage of each captured cell. Here we describe and compare five established supervised machine learning methods and a custom-built predictor for allocating cells to their cell-cycle stage on the basis of their transcriptome. In particular, we assess the impact of different normalisation strategies and the use of prior knowledge on the predictive power of the classifiers. We tested the methods on previously published datasets and found that a PCA-based approach and the custom predictor performed best. Moreover, our analysis shows that performance depends strongly on normalisation and on the use of prior knowledge. Only by leveraging prior knowledge in the form of cell-cycle annotated genes and by preprocessing the data using a rank-based normalisation is it possible to robustly capture the transcriptional cell-cycle signature across different cell types, organisms and experimental protocols.
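The rank-based normalisation mentioned above can be illustrated with a minimal sketch. The abstract does not specify the exact procedure, so the following assumes one common variant: each cell's expression values are replaced by their within-cell ranks, scaled by the number of genes. The function name `rank_normalise`, the cells-by-genes layout and the tie handling are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rank_normalise(expr):
    """Replace each cell's expression values by within-cell ranks scaled to (0, 1].

    expr: 2-D array, rows = cells, columns = genes (layout is an assumption).
    Ties are broken by column order, which is a simplification.
    """
    expr = np.asarray(expr, dtype=float)
    # argsort applied twice yields 0-based ordinal ranks along each row
    order = expr.argsort(axis=1, kind="stable")
    ranks = order.argsort(axis=1, kind="stable") + 1
    return ranks / expr.shape[1]

# Two toy cells whose counts differ tenfold in scale but share the same gene ordering
counts = np.array([[10.0, 200.0, 35.0],
                   [1.0, 20.0, 3.5]])
normalised = rank_normalise(counts)
```

Because only the ordering of genes within a cell is kept, the two toy cells end up with identical profiles despite the difference in sequencing depth. This is one plausible reason such a normalisation could transfer across protocols.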

Highlights

  • Recent technological advances have helped to establish single-cell RNA-sequencing as a robust and routine assay, enabling the transcriptional profiling of thousands of cells to be processed in an unbiased manner [1, 2].

  • PCA-based classification (section 2.1.4): Recently, we showed that the first principal component (PC) of a set of annotated cell-cycle marker genes is sufficient for constructing a cell-cell covariance matrix, reflecting the cell-cycle-induced correlation among cells [14].

  • Most methods generalize poorly to independent test data, with the exception of the PCA and pairs methods: To assess how well the different approaches generalize to independent data sets, we tested the six predictors on an independent test set of 35 mouse embryonic stem cells (mESCs) sequenced using a different protocol (Quartz-seq) and cultured in a different medium.
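The PCA-based idea in the highlights can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes a cells-by-genes expression matrix, a hypothetical list of marker-gene column indices, and a covariance matrix built as the outer product of the first-PC scores. The function name `first_pc_scores` and the toy data are invented for the example.

```python
import numpy as np

def first_pc_scores(expr, marker_idx):
    """Project cells onto the first principal component of the marker genes."""
    sub = np.asarray(expr, dtype=float)[:, marker_idx]
    centred = sub - sub.mean(axis=0)
    # SVD of the centred matrix: the rows of vt are the principal axes
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[0]

# Toy data: 6 cells x 5 genes. Columns 0-2 stand in for annotated cell-cycle
# markers (hypothetical); cells 0-2 and cells 3-5 mimic two different stages.
expr = np.array([
    [1.0, 1.2, 0.9, 5.0, 3.0],
    [1.1, 0.8, 1.0, 4.0, 2.0],
    [0.9, 1.0, 1.1, 6.0, 1.0],
    [5.0, 5.2, 4.9, 5.5, 2.5],
    [5.1, 4.8, 5.0, 4.5, 1.5],
    [4.9, 5.0, 5.1, 5.0, 3.5],
])
scores = first_pc_scores(expr, [0, 1, 2])

# Cell-cell covariance induced by the first PC (assumed form: outer product)
cov = np.outer(scores, scores)
```

In this toy setting the first-PC scores separate the two groups of cells, and the resulting matrix has positive entries for pairs of cells in the same group and negative entries for pairs in different groups. The sign of a principal component is arbitrary, so only the relative ordering of the scores is meaningful.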


Summary

Introduction

Recent technological advances have helped to establish single-cell RNA-sequencing (scRNA-seq) as a robust and routine assay, enabling the transcriptional profiling of thousands of cells to be processed in an unbiased manner [1, 2]. ScRNA-seq has helped to identify novel cell types [5] and to reveal dynamic changes of the transcriptome during temporal processes like cell differentiation [6]. Strategies based on genetic manipulation through insertion of fluorescent probes in genes that are differentially expressed in different cell-cycle stages (e.g., the FUCCI technique [13]) can be employed. These approaches have major drawbacks: they can be very labour-intensive and, due to their invasive nature, have the potential to disturb the biological system substantially (e.g., cell-cycle arrest can have a large impact on differentiation potential).
