Abstract

Background

Single-cell RNA sequencing (scRNA-seq) makes it possible to investigate cell heterogeneity, discover new cell types and perform transcriptomic reconstruction at single-cell resolution. Because of technical limitations, dropout events hinder downstream analysis, including differential expression analysis. An efficient and accurate approach is therefore needed to recover the true gene expression. To fill this gap, we present a novel scRNA-seq dropout imputation method that recovers the original expression of genes with excessive zero and near-zero counts.

Result

We have developed CDSImpute (Correlation Distance Similarity Imputation) to distinguish the dropouts induced in scRNA-seq data from biological zeros and to recover the true gene expression. CDSImpute builds a list of similar cells from the correlation and negative distance between cells, then borrows gene expression from those similar cells to detect and correct dropouts simultaneously. The improvement is consistent across simulated data and several publicly available scRNA-seq datasets. Clustering accuracy after CDSImpute imputation, measured by the adjusted Rand index, is 1.00, 0.79 and 0.34 on the Kolod, Pollen and Usoskin datasets, respectively. CDSImpute outperforms the three existing methods it was compared against, as evaluated by precise cell-type identification and detection of differentially expressed genes from scRNA-seq data.

Conclusion

CDSImpute is a novel and effective method for imputing the dropout events of an scRNA-seq expression matrix. The package is implemented in the R language and is available at https://github.com/riasatazim/CDSImpute.
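The similarity-based idea described in the abstract can be illustrated with a minimal Python sketch. It is not the CDSImpute R package or its API, and it is not claimed to reproduce the published algorithm. The function name `impute_dropouts`, the parameters `k` and `zero_tol`, the additive combination of correlation and rescaled negative Euclidean distance, and the rule that a zero counts as a dropout only when every similar cell expresses the gene are all assumptions made for illustration.

```python
import numpy as np


def impute_dropouts(expr, k=2, zero_tol=0.0):
    """Illustrative sketch only (hypothetical helper, not the CDSImpute API).

    expr     : cells x genes expression matrix
    k        : number of similar cells to borrow from (assumed parameter)
    zero_tol : values <= zero_tol are treated as zero or near-zero (assumed)
    """
    expr = np.asarray(expr, dtype=float)
    n_cells = expr.shape[0]

    # Pearson correlation between cells (rows).
    # Note: a cell with constant expression would yield NaN here.
    corr = np.corrcoef(expr)

    # Euclidean distance between cells, negated so that larger means closer.
    diff = expr[:, None, :] - expr[None, :, :]
    neg_dist = -np.sqrt((diff ** 2).sum(axis=2))

    # Rescale negative distance to [0, 1] so it is comparable to correlation.
    span = neg_dist.max() - neg_dist.min()
    if span > 0:
        neg_dist_scaled = (neg_dist - neg_dist.min()) / span
    else:
        neg_dist_scaled = np.zeros_like(neg_dist)

    # Assumed combination: simple sum of the two similarity terms.
    score = corr + neg_dist_scaled
    np.fill_diagonal(score, -np.inf)  # a cell is never its own neighbour

    imputed = expr.copy()
    for i in range(n_cells):
        # The k highest-scoring cells form the similar-cell list for cell i.
        neighbours = np.argsort(score[i])[::-1][:k]
        neighbour_mean = expr[neighbours].mean(axis=0)

        # Assumed dropout rule: zero in this cell, expressed in all neighbours.
        # Zeros shared with the neighbours are kept as biological zeros.
        expressed_in_all = (expr[neighbours] > zero_tol).all(axis=0)
        dropout = (expr[i] <= zero_tol) & expressed_in_all
        imputed[i, dropout] = neighbour_mean[dropout]
    return imputed


# Toy matrix: rows are cells, columns are genes (made-up values).
toy = np.array([
    [5, 0, 3, 0],   # gene 1 is zero here but expressed in the similar cells
    [5, 4, 3, 0],
    [6, 5, 2, 0],
    [0, 0, 0, 7],
    [0, 1, 0, 8],
    [1, 0, 0, 9],
], dtype=float)

imputed = impute_dropouts(toy, k=2)
print(imputed)
```

In this toy matrix the zero for gene 1 in the first cell is replaced by the mean of its two most similar cells, while the zeros for gene 3 in the first three cells are left alone because the similar cells are also zero there. Real data would normally be normalized and log-transformed first, and both the neighbour count and the dropout rule would need tuning.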
