Abstract

Cellular signaling systems play a vital role in maintaining homeostasis when a cell is exposed to different perturbations. Components of these systems are organized as hierarchical networks, and perturbing different components often leads to transcriptomic profiles that exhibit compositional statistical patterns. Mining such patterns to investigate how cellular signals are encoded is an important problem in systems biology, where artificial intelligence techniques can be of great assistance. Here, we investigated the capability of deep generative models (DGMs) to model signaling systems and learn representations of cellular states underlying transcriptomic responses to diverse perturbations. Specifically, we show that the variational autoencoder and the supervised vector-quantized variational autoencoder can accurately regenerate gene expression data in response to perturbagen treatments. The models can learn representations that reveal the relationships between different classes of perturbagens and enable mappings between drugs and their target genes. In summary, DGMs can adequately learn and depict how cellular signals are encoded. The resulting representations have broad applications, demonstrating the power of artificial intelligence in systems biology and precision medicine.

Highlights

  • A cellular signaling system is a signal processing machine that detects changes in the internal or external environment, encodes these changes as cellular signals, and eventually transmits these signals to effectors, which adjust cellular responses

  • The first model was trained on the small-molecule perturbagen (SMP) dataset, which contains 85,183 expression profiles from seven cell lines treated with small molecules (Supplementary Table 1)

  • We examined the utility of two deep generative models (DGMs), the variational autoencoder (VAE) and the supervised vector-quantized VAE (S-VQ-VAE), for learning representations of the cellular states of cells treated with different perturbagens in the Library of Integrated Network-based Cellular Signatures (LINCS) project

Introduction

A cellular signaling system is a signal processing machine that detects changes in the internal or external environment, encodes these changes as cellular signals, and eventually transmits these signals to effectors, which adjust cellular responses. Cellular responses to perturbations often involve changes in transcriptomic programs[1,2,3]. A common approach to studying such a system is to perturb it systematically with genetic or pharmacological perturbagens and monitor the resulting transcriptomic changes, in order to reverse engineer the system and gain insight into how cellular signals are encoded and transmitted. This approach has been employed in many large-scale systems biology studies, e.g., the yeast deletion library[4], the Connectivity Map project[5,6], and most recently, the Library of Integrated Network-based Cellular Signatures (LINCS)[7,8]. The LINCS project is arguably the most comprehensive systematic perturbation dataset currently available. In it, multiple cell lines were treated with tens of thousands of perturbagens (e.g., small molecules or single gene knockdowns), and gene expression profiles were then measured using a new technology known as the L1000 assay, which utilizes ~1000 (978) landmark genes to infer the entire transcriptome[7].
