Abstract

Although single-cell RNA sequencing (scRNA-seq) is a newly invented and promising technology, the gene expression obtained for each cell is hard to interpret because information that labels individual cells is lacking. For this reason, unsupervised methods such as t-distributed stochastic neighbor embedding and uniform manifold approximation and projection are usually employed to obtain low-dimensional embeddings that help to understand cell–cell relationships. One possible drawback of this strategy is that the outcome depends heavily on the genes selected for clustering. To address this, many methods that perform unsupervised gene selection have been proposed. In this study, tensor decomposition (TD)-based unsupervised feature extraction (FE) was applied to the integration of two scRNA-seq expression profiles that measure human and mouse midbrain development. TD-based unsupervised FE selected genes that were not only coincident between human and mouse but also biologically reliable. Both the coincidence between the two species and the biological reliability of the selected genes were higher than those obtained with principal component analysis (PCA)-based FE applied to the same data set in the previous study. Since PCA-based unsupervised FE outperformed three other popular unsupervised gene selection methods (highly variable genes, bimodal genes, and dpFeature), TD-based unsupervised FE is expected to outperform them as well. In addition, 10 transcription factors (TFs) that might regulate the selected genes and might contribute to midbrain development were identified. These 10 TFs, BHLHE40, EGR1, GABPA, IRF3, PPARG, REST, RFX5, STAT3, TCF7L2, and ZBTB33, were previously reported to be related to brain functions and diseases. TD-based unsupervised FE is a promising method for integrating two scRNA-seq profiles effectively.

Highlights

  • Single-cell RNA sequencing (Sasagawa et al, 2019) is a newly invented technology that enables us to measure the amount of RNA on a single-cell basis

  • We propose the application of tensor decomposition (TD)-based unsupervised feature extraction (FE) (Taguchi, 2017a; Taguchi, 2017b; Taguchi, 2017c; Taguchi, 2017e; Taguchi, 2017f; Taguchi and Ng, 2018; Taguchi, 2018b; Taguchi, 2018c; Taguchi, 2019a)

  • Midbrain Development of Humans and Mice: The first scRNA-seq data used in this study were obtained from Gene Expression Omnibus (GEO) under the GEO ID GSE76381; the files named “GSE76381_EmbryoMoleculeCounts.cef.txt.gz” and “GSE76381_MouseEmbryoMoleculeCounts.cef.txt.gz” were downloaded

Summary

Introduction

Single-cell RNA sequencing (scRNA-seq) (Sasagawa et al, 2019) is a newly invented technology that enables us to measure the amount of RNA on a single-cell basis. Singular value vectors attributed to genes are defined separately for the human and mouse scRNA-seq data, as u_i ∈ ℝ^{N×M} and v_i ∈ ℝ^{N×K}, respectively. As a result, following the procedure described in the Methods and Materials, we identified 55 and 44 singular value vectors attributed to cells, u_{ℓj} and v_{ℓk}, for human and mouse, respectively.
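
To make the distinction between singular value vectors attributed to cells and those attributed to genes more concrete, the following is a minimal sketch on synthetic data. It is not the procedure from the Methods and Materials, which this excerpt does not reproduce. It assumes that the two expression matrices share the same N genes, that each cell is standardized across genes, that the cell vectors come from a singular value decomposition of the gene-summed human-by-mouse product matrix, and that the gene vectors are obtained by projecting expression onto the cell vectors. The function name `cell_and_gene_vectors` and all matrix sizes are hypothetical.

```python
import numpy as np


def cell_and_gene_vectors(x_human, x_mouse, n_components):
    """Sketch only. x_human is genes x human cells (N x M),
    x_mouse is genes x mouse cells (N x K), with the same N genes."""
    # Assumption: summing gene-wise products gives a human-cell x mouse-cell matrix.
    x_jk = x_human.T @ x_mouse                       # shape (M, K)
    u, s, vt = np.linalg.svd(x_jk, full_matrices=False)
    u_cells = u[:, :n_components]                    # human cell vectors, (M, L)
    v_cells = vt[:n_components].T                    # mouse cell vectors, (K, L)
    # Assumption: gene vectors are projections of expression onto the cell vectors.
    u_genes = x_human @ u_cells                      # (N, L)
    v_genes = x_mouse @ v_cells                      # (N, L)
    return u_cells, v_cells, u_genes, v_genes, s[:n_components]


def standardize_cells(x):
    # Assumption: each cell (column) gets zero mean and unit variance across genes.
    return (x - x.mean(axis=0)) / x.std(axis=0)


# Synthetic counts stand in for the real profiles; the sizes are arbitrary.
rng = np.random.default_rng(0)
n_genes, n_human, n_mouse, n_comp = 200, 30, 25, 3
x_human = standardize_cells(rng.poisson(2.0, size=(n_genes, n_human)).astype(float))
x_mouse = standardize_cells(rng.poisson(2.0, size=(n_genes, n_mouse)).astype(float))

u_cells, v_cells, u_genes, v_genes, s = cell_and_gene_vectors(
    x_human, x_mouse, n_comp
)
print(u_cells.shape, v_cells.shape, u_genes.shape, v_genes.shape)
```

Under these assumptions, the paired gene vectors of the two species satisfy `u_genes.T @ v_genes == diag(s)` up to rounding, which is one way to see why components that are large for the same genes in both species would be natural candidates for cross-species gene selection.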
