A novel joint analysis framework improves identification of differentially expressed genes in cross disease transcriptomic analysis

Wenyi Qin, Hui Lu

doi:10.1186/s13040-018-0163-y

Open Access

Journal: BioData Mining
Publication Date: Feb 20, 2018
Citations: 6
License type: open-access

Affiliation: University of Illinois at Chicago

Abstract

Motivation: Detecting differentially expressed (DE) genes between disease and normal control groups is one of the most common analyses of genome-wide transcriptomic data. Because most studies have few samples, researchers have used meta-analysis to combine different datasets for the same disease. Even then, the statistical power is often still insufficient. Because many diseases share disease genes, it is desirable to design a statistical framework that identifies the diseases’ common and specific DE genes simultaneously to improve identification power.

Results: We developed a novel empirical Bayes based mixture model to identify DE genes in a specific study by leveraging the information shared across expression data sets from multiple different diseases. The effectiveness of joint analysis was demonstrated through comprehensive simulation studies and two real data applications. The simulation results showed that our method consistently outperformed single data set analysis and two other meta-analysis methods in identification power. In real data analysis, our method overall demonstrated better identification power in detecting DE genes and prioritized more disease related genes and disease related pathways than single data set analysis. Over 150% more disease related genes were identified by our method in the application to Huntington’s disease. We expect that our method will provide researchers with a new way of using available data sets from different diseases when the sample size for the disease of interest is limited.

Highlights

High-throughput technology such as microarray and next-generation sequencing (NGS) allows researchers to measure the expression of thousands of genes or microRNAs in one sample simultaneously
In this paper, we present a novel statistical framework that addresses a problem biological researchers often face: when only a limited number of samples is available for a specific disease, identification power can be improved by jointly analyzing data sets from multiple similar diseases, because differentially expressed (DE) genes might be shared among similar diseases
By implementing a two-component mixture model, we demonstrate through comprehensive simulation studies and two real data applications that the framework can improve identification power
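
To make the idea of borrowing information across diseases concrete, the following is a minimal sketch and not the authors' model. It assumes two studies whose per-gene z-scores follow N(0, 1) when a gene is not DE and N(0, alt_sd^2) when it is DE, and it fits the probabilities of the four joint states (DE in neither study, in A only, in B only, in both) by EM. The function names, the fixed `alt_sd`, and the simulated proportions are all hypothetical choices made for illustration.

```python
import numpy as np


def normal_pdf(x, sd):
    """Density of a zero-mean normal with standard deviation sd."""
    return np.exp(-0.5 * (x / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def fit_joint_mixture(z_a, z_b, alt_sd=3.0, n_iter=200):
    """EM over four joint states: DE in neither, A only, B only, both.

    Assumption for this sketch: within each study a z-score is N(0, 1)
    if the gene is not DE and N(0, alt_sd^2) if it is DE, and the two
    studies are independent given the joint state.
    Returns the state probabilities and the per-gene state posteriors.
    """
    # Per-study likelihoods: column 0 = not DE, column 1 = DE.
    lik_a = np.column_stack([normal_pdf(z_a, 1.0), normal_pdf(z_a, alt_sd)])
    lik_b = np.column_stack([normal_pdf(z_b, 1.0), normal_pdf(z_b, alt_sd)])

    # Joint likelihoods for states ordered (0,0), (1,0), (0,1), (1,1).
    states = [(0, 0), (1, 0), (0, 1), (1, 1)]
    joint = np.column_stack([lik_a[:, a] * lik_b[:, b] for a, b in states])

    pi = np.full(4, 0.25)  # start from equal state probabilities
    for _ in range(n_iter):
        weighted = joint * pi
        post = weighted / weighted.sum(axis=1, keepdims=True)  # E-step
        pi = post.mean(axis=0)                                 # M-step

    # Posteriors under the final state probabilities.
    weighted = joint * pi
    post = weighted / weighted.sum(axis=1, keepdims=True)
    return pi, post


# Simulated example (hypothetical proportions): 10% of genes DE in both studies.
rng = np.random.default_rng(0)
n_genes = 2000
state = rng.choice(4, size=n_genes, p=[0.80, 0.05, 0.05, 0.10])
de_a = np.isin(state, [1, 3])  # truly DE in study A
de_b = np.isin(state, [2, 3])  # truly DE in study B
z_a = rng.normal(0.0, np.where(de_a, 3.0, 1.0))
z_b = rng.normal(0.0, np.where(de_b, 3.0, 1.0))

pi, post = fit_joint_mixture(z_a, z_b)

# Probability that a gene is DE in study A: "A only" plus "both".
prob_de_a = post[:, 1] + post[:, 3]
print("estimated state probabilities:", np.round(pi, 3))
```

In this sketch, a gene with a borderline z-score in study A receives a higher posterior probability of being DE when its z-score in study B is also large, because the fitted probability of the shared state links the two studies.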

Summary

Introduction

High-throughput technology such as microarray and next-generation sequencing (NGS) allows researchers to measure the expression of thousands of genes or microRNAs in one sample simultaneously. [...] clinical diagnosis tools and investigating potential drug targets. This approach has been successfully applied to many complex diseases such as cancers [10, 13] and diabetes [5, 33]. With the cost of microarray and next-generation sequencing techniques decreasing and experimental protocols stabilizing, over 1,000,000 samples have been deposited in public databases such as Gene Expression Omnibus (GEO) [6]. With this huge amount of public data available, researchers can perform cross disease transcriptomic comparison analysis. Cross disease transcriptomic analysis has given researchers new opportunities to understand the mechanisms of complex diseases and to discover new biomarkers.

Qin and Lu, BioData Mining (2018) 11:3
