Abstract

Here, we present a multi-modal deep generative model, the single-cell Multi-View Profiler (scMVP), which is designed for handling sequencing data that simultaneously measure gene expression and chromatin accessibility in the same cell, including SNARE-seq, sci-CAR, Paired-seq, SHARE-seq, and Multiome from 10X Genomics. scMVP generates common latent representations for dimensionality reduction, cell clustering, and developmental trajectory inference and generates separate imputations for differential analysis and cis-regulatory element identification. scMVP can help mitigate data sparsity issues with imputation and accurately identify cell groups for different joint profiling techniques with common latent embedding, and we demonstrate its advantages on several realistic datasets.

Highlights

  • Cis-regulatory elements (CREs), which are bound by combinations of transcription factors, drive cell-type-specific and time-dependent regulation of gene expression

  • The single-cell Multi-View Profiler (scMVP) places a Gaussian mixture model (GMM) prior on the latent space and derives the common latent embedding by maximizing the likelihood of the joint generation probability of the multi-omic data. It is implemented as a multi-modal asymmetric GMM-variational autoencoder (VAE) with two extra clustering consistency modules that align each imputed omics layer and preserve the common semantic information. The model is used to impute missing data, cluster cell groups, assemble multiple modalities, and construct a developmental lineage

  • The generated scRNA and scATAC data are each denoised and imputed by the mean of the corresponding output distribution, while the embedded common latent code $z$ can be used for a series of downstream analyses, e.g., visualization and trajectory analysis. The latent code is inferred through a variational process by maximizing the variational evidence lower bound (ELBO), that is, $\mathcal{L}_{\mathrm{elbo}}(x, y) = \mathbb{E}_{q(z, c \mid x, y)}\left[\log \frac{p(x, y, z, c)}{q(z, c \mid x, y)}\right]$. scMVP estimates the distribution parameters of $q(z, c \mid x, y)$ with another joint Encoder neural network, e.g., the mean $\mu_z$ and standard deviation $\sigma_z$, and samples $z = \mu_z + \sigma_z I$, $I \sim N(0, 1)$, using a reparameterization trick so that gradients can be back-propagated
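
The sampling step and the role of the GMM prior described in the highlights above can be illustrated with a small NumPy sketch. This is not the scMVP implementation: the encoder outputs (`mu_z`, `sigma_z`), the prior parameters (`weights`, `mu_c`, `sigma_c`), and all function names are hypothetical placeholders, and diagonal Gaussian components are assumed. It shows only how a latent code is drawn as the mean plus scaled standard-normal noise, and how a mixture prior turns that code into soft cluster assignments.

```python
import numpy as np


def reparameterize(mu_z, sigma_z, rng):
    """Draw z = mu_z + sigma_z * eps with eps ~ N(0, 1) (reparameterization trick)."""
    eps = rng.standard_normal(mu_z.shape)
    return mu_z + sigma_z * eps


def gmm_log_joint(z, weights, mu_c, sigma_c):
    """Return log p(z, c) for each cell and component, shape (n_cells, n_clusters).

    Assumes diagonal Gaussian components N(mu_c, sigma_c^2).
    """
    diff = z[:, None, :] - mu_c[None, :, :]  # (n_cells, n_clusters, n_latent)
    log_density = -0.5 * np.sum(
        np.log(2.0 * np.pi * sigma_c**2)[None, :, :]
        + (diff / sigma_c[None, :, :]) ** 2,
        axis=-1,
    )
    return np.log(weights)[None, :] + log_density


def responsibilities(log_joint):
    """Normalize log p(z, c) over c with a numerically stable softmax."""
    shifted = log_joint - log_joint.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    return probs / probs.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
n_cells, n_latent, n_clusters = 5, 2, 3

# Hypothetical encoder outputs; in a trained model a neural network produces these.
mu_z = rng.normal(size=(n_cells, n_latent))
sigma_z = np.full((n_cells, n_latent), 0.1)

# Hypothetical GMM prior parameters (uniform weights, unit scales).
weights = np.full(n_clusters, 1.0 / n_clusters)
mu_c = rng.normal(size=(n_clusters, n_latent))
sigma_c = np.ones((n_clusters, n_latent))

z = reparameterize(mu_z, sigma_z, rng)
gamma = responsibilities(gmm_log_joint(z, weights, mu_c, sigma_c))
cluster = gamma.argmax(axis=1)  # hard cluster label per cell
```

Because the noise `eps` is sampled independently of `mu_z` and `sigma_z`, an automatic-differentiation framework could propagate gradients through `z` to the encoder parameters; plain NumPy is used here only to keep the sketch self-contained.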

Summary

Introduction

Cis-regulatory elements (CREs), which are bound by combinations of transcription factors, drive cell-type-specific and time-dependent regulation of gene expression. Genome-wide mapping of CREs and their activity patterns across cells and tissues can provide insights into the mechanisms of gene regulation. As CREs are mostly located in open chromatin regions, epigenomic sequencing technologies such as DNase-seq [1, 2] and ATAC-seq [3] have been developed to detect open chromatin regions and measure chromatin accessibility in tissues and cells. The advancement of single-cell technologies, such as scRNA-seq [4, 5] and scATAC-seq [6, 7], provides powerful tools to uncover complex and dynamic gene regulatory networks during tissue development across different cell types. 10X Genomics recently developed a “multiome” approach.
