Abstract

Single-cell RNA sequencing (scRNA-seq) technology has recently been widely applied in biological studies such as drug discovery. Before the functionality of single cells can be investigated in depth for pathological purposes, cell types must be identified, an essential step that computational methods can speed up. Supervised learning methods have recently been developed to identify cell types automatically. However, because sufficiently annotated datasets are lacking, these methods have not been commonly used in scRNA-seq studies. Classification methods can take advantage of feature selection techniques to improve cell type prediction while identifying the most informative genes among the large number of genes in high-dimensional scRNA-seq datasets. To this end, we introduce a combination of two powerful techniques, one for representation learning and one for unsupervised feature selection, that achieves automatic cell type identification in two steps. The approach obtained an average prediction accuracy of 98% on six different cell types in a Human Pancreas scRNA-seq dataset. In addition, we found that 11 of the 13 selected genes are biologically related to two cell types in the Human Pancreas, which supports the effectiveness of the proposed approach.
