Abstract

Transcriptomics technologies such as next-generation sequencing and microarray platforms provide exciting opportunities for improving the diagnosis and treatment of complex diseases. Transcriptomics studies often share similar hypotheses but are carried out on different platforms, under different conditions, and with different analysis approaches. These factors, together with small sample sizes, can result in a lack of reproducibility. A clear understanding and unified picture of many complex diseases remain elusive, highlighting an urgent need to integrate multiple transcriptomic studies effectively when searching for disease signatures. We have integrated more than 3,000 high-quality transcriptomic datasets spanning oncology, immunology, neuroscience, and cardiovascular and metabolic disease, drawn from both public and internal sources (the DiseaseLand database). We established a systematic data integration and meta-analysis approach that can be applied in multiple disease areas to create a unified picture of the disease signature and to prioritize drug targets, pathways, and compounds. In this bipolar disorder case study, we provide an illustrative example, using our approach to combine a total of 30 genome-wide gene expression studies of postmortem human brain samples. First, the studies were integrated by extracting the raw FASTQ or CEL files and running them through the same procedures for preprocessing, normalization, and statistical inference. Second, both p-value-based and effect-size-based meta-analysis algorithms were used to identify a total of 204 differentially expressed (DE) genes (FDR < 0.05) in the prefrontal cortex. Among these were BDNF, VGF, WFS1, DUSP6, CRHBP, MAOA, and RELN, which have previously been implicated in bipolar disorder. Finally, pathway enrichment analysis revealed a role for GPCR, MAPK, immune, and Reelin pathways, and compound profiling analysis revealed that MAPK and other inhibitors may modulate the DE genes. The ability to robustly combine and synthesize the information from multiple studies enables a more powerful understanding of this complex disease.
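
The two families of meta-analysis named above can be illustrated with a minimal sketch. The abstract does not say which specific algorithms were used, so the choices here are assumptions: Fisher's method stands in for the p-value-based approach, an inverse-variance fixed-effect model stands in for the effect-size-based approach, and the Benjamini-Hochberg procedure stands in for the FDR control. The function names, gene labels, and numbers are hypothetical and are not taken from the study.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent per-study p-values with Fisher's method (assumed choice)."""
    k = len(p_values)
    # Fisher's statistic is -2 * sum(ln p), chi-square with 2k degrees of freedom.
    half_stat = -sum(math.log(p) for p in p_values)  # statistic / 2
    # Closed-form chi-square survival function, valid for even degrees of freedom.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def fixed_effect(effects, std_errors):
    """Inverse-variance fixed-effect pooling of per-study effect sizes (assumed choice)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled / pooled_se
    # Two-sided p-value under a standard normal null.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, p_value


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical per-gene p-values from three studies (illustrative only).
    per_gene = {
        "GENE_A": [0.01, 0.03, 0.20],
        "GENE_B": [0.40, 0.55, 0.70],
        "GENE_C": [0.002, 0.01, 0.05],
    }
    genes = list(per_gene)
    combined = [fisher_combined_p(per_gene[g]) for g in genes]
    fdr = benjamini_hochberg(combined)
    for gene, p, q in zip(genes, combined, fdr):
        status = "DE" if q < 0.05 else "not DE"
        print(f"{gene}: combined p = {p:.4g}, FDR = {q:.4g} ({status})")
```

In a sketch like this, a gene would be called DE when its adjusted value falls below the 0.05 threshold. Both approaches assume the studies are independent, and a random-effects model would be the usual alternative when between-study heterogeneity is expected.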

Highlights

  • Transcriptomics technologies such as next-generation sequencing (NGS) based RNA-Sequencing (RNA-Seq) and DNA chip based gene expression microarrays provide a high-throughput and cost-effective solution for evaluating whole-genome gene expression signatures (Ramasamy et al, 2008; Wu et al, 2017a).

  • Although not significant after multiple-testing correction, these differentially expressed (DE) genes showed enrichment in mental depression (DOID:1596, p-value = 0.004), mood disorder (DOID:3324, p-value = 0.005), and schizoaffective disorder (DOID:5418, p-value = 0.01) (Supplementary Table S6).

  • Compounds significantly associated with an increase or decrease in bipolar-associated gene expression changes are listed in Supplementary Tables S7 and S8.

Summary

Introduction

Transcriptomics technologies such as next-generation sequencing (NGS) based RNA-Sequencing (RNA-Seq) and DNA chip based gene expression microarrays provide a high-throughput and cost-effective solution for evaluating whole-genome gene expression signatures (Ramasamy et al, 2008; Wu et al, 2017a). These platforms enable researchers to measure tens of thousands of genes simultaneously and have become one of the most widely used approaches in biological research. Numerous omics studies on human diseases and animal models are published each year. There is a clear need to effectively manage, integrate, and synthesize the information from related transcriptomics studies to improve our understanding and generate a unified picture of complex diseases.

