Abstract

Background: Single-cell RNA sequencing (scRNA-seq) provides an effective tool for investigating transcriptomic characteristics at single-cell resolution. Because of the low amounts of transcripts in single cells and technical biases in experiments, raw scRNA-seq data are usually very noisy, which complicates downstream analyses. Although many methods have been proposed in recent years to impute noisy scRNA-seq data, few of them take into account prior associations across genes during imputation or integrate multiple types of imputed data to identify cell types.

Results: We present a new framework, NetImpute, for identifying cell types from scRNA-seq data by integrating multiple types of biological networks. We employ a statistical method to detect noisy data items in scRNA-seq data and develop a new imputation model that estimates their real values by integrating the protein-protein interaction (PPI) network and gene pathways. Based on the data imputed with multiple types of biological networks, we also propose an integrated approach to identify cell types from scRNA-seq data. Comprehensive experiments demonstrate that the proposed network-based imputation model can accurately estimate the real values of noisy data items, and that integrating the imputed data based on multiple types of biological networks can improve the identification of cell types from scRNA-seq data.

Conclusions: Incorporating prior gene associations from biological networks can help improve the imputation of noisy scRNA-seq data, and integrating multiple types of network-based imputed data can enhance the identification of cell types. NetImpute provides an open framework for incorporating multiple types of biological network data to identify cell types from scRNA-seq data.

Highlights

  • Single-cell RNA sequencing provides an effective tool to investigate the transcriptomic characteristics at the single-cell resolution

  • The main contributions of this study can be summarized as follows: (1) We propose a new imputation model to estimate the real values of noise data items in scRNA-seq data by taking into account the association information across genes based on biological networks

  • Datasets and data processing: To evaluate the effectiveness of the proposed imputation method, NetImpute, we used three public scRNA-seq datasets from the Gene Expression Omnibus (GEO) database in our experiments


Summary

Introduction

Single-cell RNA sequencing (scRNA-seq) provides an effective tool for investigating transcriptomic characteristics at single-cell resolution. However, scRNA-seq data usually contain more noise than bulk RNA sequencing data because of the low amounts of transcripts in single cells and technical biases in sequencing [6, 7], which complicates downstream analyses. The best-known type of noise in scRNA-seq data is the dropout event, in which a gene is expressed, even at a high level, but is not detected in sequencing because of limited technical sensitivity [8, 9].
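
To make the idea of network-guided imputation concrete, the sketch below fills a suspected dropout value with the mean expression of the gene's neighbours in a gene-gene network. It is only an illustration under simple assumptions and is not the NetImpute model itself: the function name `impute_with_network`, the toy matrices, and the plain neighbour average are all hypothetical, and the mask of flagged entries is supplied by hand rather than detected statistically.

```python
import numpy as np


def impute_with_network(expr, adjacency, flagged):
    """Fill flagged entries with the mean expression of network neighbours.

    expr      : (genes, cells) expression matrix
    adjacency : (genes, genes) symmetric 0/1 gene-gene network (toy PPI here)
    flagged   : (genes, cells) boolean mask, True where a value is treated as noise

    Hypothetical helper for illustration; a plain neighbour average is assumed.
    """
    expr = np.asarray(expr, dtype=float)
    adjacency = np.asarray(adjacency, dtype=float)
    flagged = np.asarray(flagged, dtype=bool)
    reliable = ~flagged

    # Per gene and cell: sum and count of neighbour values that are not flagged.
    neighbour_sum = adjacency @ (expr * reliable)
    neighbour_cnt = adjacency @ reliable.astype(float)

    # Mean over reliable neighbours; stays 0 where no reliable neighbour exists.
    neighbour_mean = np.divide(
        neighbour_sum,
        neighbour_cnt,
        out=np.zeros_like(neighbour_sum),
        where=neighbour_cnt > 0,
    )

    # Only overwrite flagged entries that have at least one reliable neighbour.
    imputed = expr.copy()
    fill = flagged & (neighbour_cnt > 0)
    imputed[fill] = neighbour_mean[fill]
    return imputed


# Toy data: 3 genes x 2 cells; gene 0 interacts with genes 1 and 2.
expr = np.array([[0.0, 5.0],
                 [4.0, 6.0],
                 [6.0, 2.0]])
ppi = np.array([[0, 1, 1],
                [1, 0, 0],
                [1, 0, 0]])
# Assume gene 0 in cell 0 was flagged as a dropout.
flagged = np.array([[True, False],
                    [False, False],
                    [False, False]])

imputed = impute_with_network(expr, ppi, flagged)
print(imputed)
```

In this toy case the flagged zero for gene 0 in cell 0 is replaced by the average of its two neighbours in that cell, (4 + 6) / 2 = 5, while all other entries are left untouched. A flagged entry with no reliable neighbours is left as it was, which is a deliberate choice in this sketch rather than something stated in the source.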

