Abstract

The development of single-cell RNA sequencing (scRNA-seq) has enabled gene expression to be quantified at single-cell resolution. This advance is expected to address important questions that bulk RNA sequencing could not fully answer, such as inferring cell population heterogeneity and the genetic variability of cells, detecting rare cell types, and accurately predicting cell states and their localization. However, analyzing such large-scale data, especially when sampled at multiple time points, brings new challenges in mining informative genes compared with single-snapshot samples. The task becomes even more complicated when gene expression patterns are to be mined from time-series scRNA-seq datasets generated under multiple conditions, which constitute data with gene, condition, and time dimensions. Here, we focused on detecting gene expression patterns that capture the underlying biological differences between time-series scRNA-seq datasets of three different types of stem cells. The gene expression profiles of 2,128 time-series scRNA-seq samples from long-term hematopoietic stem cells (LT-HSC) and two of their progenitor cell types were analyzed using our framework. We detected condition-specific feature genes that achieved 90.03% classification accuracy between the three cell types. Investigating the genes and clusters detected by our framework, we found that cell cycle-related genes showed significantly high variance between the three cell types. These results and the transcriptomic characteristics detected in our analysis were consistent with the original study. Collectively, the framework successfully detected biologically meaningful gene sets and expression patterns from multi-condition time-series scRNA-seq samples.

Highlights

  • The advent and rapid advancement of single-cell RNA-seq has brought enormous attention to the field of transcriptomics in recent years

  • Based on bulk RNA-seq data, we previously developed a multi-class time-series clustering tool, TimesVector [11], which successfully found gene clusters with significantly distinctive expression patterns in rice plants treated with four different hormones [12]

  • To build an accurate classifier of cell types and to detect condition-specific time-series gene expression patterns, we propose a framework for integrating and analyzing large sets of multi-class time-series single-cell RNA-seq data


Introduction

The advent and rapid advancement of single-cell RNA-seq (scRNA-seq) has brought enormous attention to the field of transcriptomics in recent years. Gene expression measured by bulk RNA-seq represents transcript quantity on a population basis, whereas single-cell RNA-seq gives a snapshot of transcript abundance in each cell. The function of a cell type is mostly defined by the proteins expressed within the cell. A single bulk RNA-seq sample can be viewed as representative of a large number of subpopulations from different cell types. In contrast, single-cell sequencing techniques allow us to monitor global gene regulation in thousands of individual cells in a single experiment, which may lead to the discovery of new cell types or even change the way they are defined. By successfully detecting such cell type-specific genes, initially unknown cell types can be inferred, improving our understanding of the biological mechanisms of interest.
