Abstract

Single-cell sequencing technologies produce high-resolution data at the level of individual cells, providing unprecedented opportunities to capture the diversity of cell populations and dissect the cellular heterogeneity of complex diseases. At the same time, relatively high biological and technical noise poses new challenges for single-cell data analysis. In particular, single-cell RNA sequencing (scRNA-seq) data often contain substantial missing values caused by gene dropout events. Here, we developed a convolutional neural network-based model to recover missing values in scRNA-seq data. We first estimated dropout probabilities using a gamma-normal expectation-maximization algorithm. Unlike most existing approaches, our model recovered only those expression values whose dropout probability exceeded a threshold. We used the mean squared error and the Pearson correlation coefficient to assess the accuracy of the predicted expression values, and computed purity and entropy to measure the homogeneity of cell clusters derived from the imputed gene expression profiles. Across various scRNA-seq datasets, our model demonstrated robust performance and achieved results comparable to or better than those of other imputation methods.
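
The thresholded recovery step and the four evaluation metrics named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the default threshold of 0.5, and the specific formulas for purity and entropy (the standard cluster-level definitions, with entropy in bits weighted by cluster size) are assumptions made for illustration, and the dropout probabilities and predicted values are taken as given rather than produced by an EM fit or a neural network.

```python
import numpy as np

def selective_impute(observed, predicted, dropout_prob, threshold=0.5):
    """Replace only entries whose dropout probability exceeds the threshold.

    `threshold=0.5` is a hypothetical default, not a value from the paper.
    """
    return np.where(dropout_prob > threshold, predicted, observed)

def masked_mse_and_pearson(true_expr, imputed_expr, mask):
    """Mean squared error and Pearson r over the masked (held-out) entries."""
    t = true_expr[mask]
    p = imputed_expr[mask]
    mse = float(np.mean((t - p) ** 2))
    r = float(np.corrcoef(t, p)[0, 1])
    return mse, r

def purity_and_entropy(cluster_ids, true_labels):
    """Cluster purity and size-weighted cluster entropy (assumed definitions)."""
    cluster_ids = np.asarray(cluster_ids)
    true_labels = np.asarray(true_labels)
    n = cluster_ids.size
    majority_total = 0
    entropy = 0.0
    for c in np.unique(cluster_ids):
        members = true_labels[cluster_ids == c]
        _, counts = np.unique(members, return_counts=True)
        # Purity counts the cells that carry the cluster's majority label.
        majority_total += counts.max()
        # Entropy of the label distribution inside this cluster, in bits.
        p = counts / counts.sum()
        entropy += (members.size / n) * float(-(p * np.log2(p)).sum())
    return majority_total / n, entropy
```

Under these definitions, higher purity and lower entropy both indicate more homogeneous clusters, and restricting the error metrics to a mask keeps confidently observed values from diluting the comparison.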
