Abstract

Motivation: High-throughput single-cell quantitative real-time polymerase chain reaction (qPCR) is a promising technique allowing for new insights into complex cellular processes. However, a PCR reaction can be detected only up to a certain detection limit; a failed reaction may reflect low or absent expression, so the true expression level is unknown. Because this censoring can occur for high proportions of the data, it is one of the main challenges when dealing with single-cell qPCR data. Principal component analysis (PCA) is an important tool for visualizing the structure of high-dimensional data as well as for identifying subpopulations of cells. However, to date it is not clear how to perform a PCA of censored data. We present a probabilistic approach that accounts for the censoring and evaluate it for two typical datasets containing single-cell qPCR data.

Results: We use the Gaussian process latent variable model framework to account for censoring by introducing an appropriate noise model and allowing a different kernel for each dimension. We evaluate this new approach for two typical qPCR datasets (of mouse embryonic stem cells and blood stem/progenitor cells, respectively) by performing linear and non-linear probabilistic PCA. Taking the censoring into account results in a 2D representation of the data that better reflects its known structure: in both datasets, our new approach results in a better separation of known cell types and is able to reveal subpopulations in one dataset that could not be resolved using standard PCA.

Availability and implementation: The implementation was based on the existing Gaussian process latent variable model toolbox (https://github.com/SheffieldML/GPmat); extensions for noise models and kernels accounting for censoring are available at http://icb.helmholtz-muenchen.de/censgplvm.

Contact: fbuettner.phys@gmail.com

Supplementary information: Supplementary data are available at Bioinformatics online.

Highlights

  • To gain fundamental insights into complex cellular processes, it is necessary to observe individual cells

  • In both datasets, a considerable fraction of the data is censored in some dimensions, whereas no censoring occurs in others

  • For the mouse embryonic stem cell (mESC) data, the subpopulation structure was better reflected when censoring was taken into account in the non-linear case: in contrast to non-linear probabilistic principal component analysis (PCA) with the substitution approach, two subpopulations could be identified, corresponding to cells from the 16-cell stage with high Id2 expression and cells in the inner cell mass (ICM) with high Fgf4 expression
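The difference between a substitution approach and a censoring-aware noise model can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact noise model, so the code below assumes a generic Tobit-style Gaussian likelihood in which expression values below a detection limit are censored. All function names and the `limit` parameter are hypothetical and chosen for illustration only.

```python
import math


def normal_logpdf(y, mean, sigma):
    """Log density of a Gaussian observation."""
    z = (y - mean) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def normal_logcdf(x, mean, sigma):
    """Log probability that a Gaussian variable falls at or below x."""
    z = (x - mean) / sigma
    p = 0.5 * math.erfc(-z / math.sqrt(2.0))
    # Guard against underflow to zero for values far in the lower tail.
    return math.log(max(p, 1e-300))


def censored_loglik(values, censored, mean, sigma, limit):
    """Assumed Tobit-style likelihood: a censored point only tells us
    that the true value lies at or below the detection limit."""
    total = 0.0
    for y, is_censored in zip(values, censored):
        if is_censored:
            total += normal_logcdf(limit, mean, sigma)
        else:
            total += normal_logpdf(y, mean, sigma)
    return total


def substitution_loglik(values, censored, mean, sigma, limit):
    """Substitution: replace every censored point by the limit itself
    and then treat it as an ordinary exact observation."""
    filled = [limit if c else y for y, c in zip(values, censored)]
    return sum(normal_logpdf(y, mean, sigma) for y in filled)


if __name__ == "__main__":
    # Hypothetical expression values; None marks a failed reaction.
    values = [2.1, 1.7, None, None]
    censored = [v is None for v in values]
    print(censored_loglik(values, censored, mean=1.0, sigma=1.0, limit=0.0))
    print(substitution_loglik(values, censored, mean=1.0, sigma=1.0, limit=0.0))
```

Under this assumed likelihood, a censored measurement does not pull the estimated mean towards the detection limit as strongly as a substituted value does. This matches the intuition for why accounting for censoring can change the resulting low-dimensional representation.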


Summary

Introduction

1.1 High-throughput single-cell qPCR

To gain fundamental insights into complex cellular processes, it is necessary to observe individual cells. One such process is the transcriptional control of cell fate decisions, where it is crucial to quantify the gene expression of individual cells because cell fate decisions are made on a single-cell level. In contrast to single-cell measurements, conventional experimental techniques measure gene expression from pools of cells, masking heterogeneities within cell populations, which may be important for understanding underlying biological processes (Dalerba et al., 2011; Dominguez et al., 2013; Guo et al., 2010; Moignard et al., 2013; Pina et al., 2012). Recent technical advances facilitate the simultaneous measurement of tens to thousands of genes in hundreds of individual cells (Taniguchi et al., 2009). The messenger RNA content of single cells can be analysed using high-throughput quantitative real-time polymerase chain reaction (qPCR) platforms, such as the Fluidigm BioMark HD, or using deep sequencing [RNA sequencing (RNA-Seq)].
