Abstract

Recent single-cell multimodal data reveal multi-scale characteristics of single cells, such as transcriptomics, morphology, and electrophysiology. However, integrating and analyzing such multimodal data to gain a deeper understanding of functional genomics and gene regulation across these cellular characteristics remains challenging. To address this, we applied and benchmarked multiple machine learning methods to align gene expression and electrophysiological data of single neuronal cells in the mouse brain from the Brain Initiative. We found that nonlinear manifold learning outperforms the other methods. After manifold alignment, the cells form clusters that correspond closely to transcriptomic and morphological cell types, suggesting a strong nonlinear relationship between gene expression and electrophysiology at the cell-type level. In addition, the electrophysiological features can be predicted well from gene expression in the latent space obtained from manifold alignment. The aligned cells further show continuous changes in electrophysiological features, implying cross-cluster gene expression transitions. Functional enrichment and gene regulatory network analyses of those cell clusters revealed potential genome functions and molecular mechanisms linking gene expression to neuronal electrophysiology.

Highlights

  • Recent single-cell multimodal data reveal multi-scale characteristics of single cells, such as transcriptomics, morphology, and electrophysiology

  • The machine learning methods for alignment include linear manifold alignment (LMA) and nonlinear manifold alignment (NMA)[15], manifold warping (MW)[16], manifold alignment based on maximum mean discrepancy measure (MMD-MA)[17], unsupervised topological alignment of single-cell multi-omics integration (UnionCom)[18], Single-Cell alignment using Optimal Transport (SCOT)[19], Manifold Aligning GAN (MAGAN)[20], Canonical Correlation Analysis (CCA), Reduced Rank Regression (RRR)[5,21], Principal Component Analysis (PCA, no alignment) and t-Distributed Stochastic Neighbor Embedding (t-SNE, no alignment)[9]

  • Maximum Mean Discrepancy-Manifold Alignment (MMD-MA) embeds the latent spaces onto a common reproducing kernel Hilbert space by minimizing the MMD across omics
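As a rough illustration of the quantity that MMD-MA minimizes, the following sketch computes a biased estimate of the squared maximum mean discrepancy between two sets of latent embeddings using a Gaussian kernel. It is not the MMD-MA implementation benchmarked in the paper: the function names, the kernel bandwidth `sigma`, and the toy embeddings are all assumptions made for illustration only.

```python
import math

def rbf_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mean_kernel(xs, ys, sigma):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, sigma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(xs, ys, sigma=1.0):
    """Biased estimate of the squared MMD between samples xs and ys."""
    return (mean_kernel(xs, xs, sigma)
            + mean_kernel(ys, ys, sigma)
            - 2.0 * mean_kernel(xs, ys, sigma))

# Hypothetical 2-D latent embeddings of three cells per modality (made-up numbers).
gene_latent = [[0.0, 0.1], [0.2, -0.1], [1.0, 1.1]]
ephys_aligned = [[0.1, 0.0], [0.1, 0.0], [1.1, 1.0]]
# The same electrophysiology embedding, shifted far from the gene embedding.
ephys_shifted = [[p[0] + 3.0, p[1] + 3.0] for p in ephys_aligned]

mmd_aligned = mmd_squared(gene_latent, ephys_aligned)
mmd_shifted = mmd_squared(gene_latent, ephys_shifted)
print(mmd_aligned < mmd_shifted)
```

A smaller value indicates that the two modalities occupy more similar distributions in the shared space, which is the sense in which an alignment method would drive this term down during optimization.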


Summary

Introduction

Recent single-cell multimodal data reveal multi-scale characteristics of single cells, such as transcriptomics, morphology, and electrophysiology. Prior studies were limited to building accurate "black box" models and lacked biological interpretability, especially for linking gene expression and functional genomics to various cellular phenotypes. To address this challenge, we applied and benchmarked various machine learning methods for data alignment, including manifold learning, an emerging nonparametric machine learning approach, to align single-cell gene expression and electrophysiological feature data from multiple regions of the mouse brain. Manifold learning identified biologically meaningful cross-modal cell clusters in the latent spaces after alignment. This finding suggests a strong nonlinear relationship (manifold structure) linking genes and electrophysiological features at the cell-type level. Our enrichment analyses of the cell clusters, including GO terms, KEGG pathways, and gene regulatory networks, further revealed the underlying functions and mechanisms linking genes to cellular electrophysiology in the mouse brain.

