Mergeomics: multidimensional data integration to identify pathogenic perturbations to biological systems.

Le Shu, Johannes Kettunen, Xia Yang, Aldons J Lusis, Samuli Ripatti, Zeyneb Kurt, Ville-Petteri Mäkinen, Bin Zhang, Matteo Pellegrini, Yuqi Zhao, Luz D Orozco, Michael Inouye, Sean Geoffrey Byars, Taru Tukiainen

doi:10.1186/s12864-016-3198-9

Abstract

Background: Complex diseases are characterized by multiple subtle perturbations to biological processes. New omics platforms can detect these perturbations, but translating the diverse molecular and statistical information into testable mechanistic hypotheses is challenging. Therefore, we set out to create a public tool that integrates these data across multiple datasets, platforms, study designs and species in order to detect the most promising targets for further mechanistic studies.

Results: We developed Mergeomics, a computational pipeline consisting of independent modules that 1) leverage multi-omics association data to identify biological processes that are perturbed in disease, and 2) overlay the disease-associated processes onto molecular interaction networks to pinpoint hubs as potential key regulators. Unlike existing tools, which are mostly dedicated to a specific data type or setting, the Mergeomics pipeline accepts and integrates datasets across platforms, data types and species. We optimized and evaluated the performance of Mergeomics using simulation and multiple independent datasets, and benchmarked the results against alternative methods. We also demonstrate the versatility of Mergeomics in two case studies that include genome-wide, epigenome-wide and transcriptome-wide datasets from human and mouse studies of total cholesterol and fasting glucose. In both cases, the Mergeomics pipeline provided statistical and contextual evidence to prioritize further investigations in the wet lab. The software implementation of Mergeomics is freely available as a Bioconductor R package.

Conclusion: Mergeomics is a flexible and robust computational pipeline for multidimensional data integration. It outperforms existing tools, and is easily applicable to datasets from different studies, species and omics data types for the study of complex traits.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-3198-9) contains supplementary material, which is available to authorized users.
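To make the second step of the pipeline (overlaying disease-associated genes onto a network to pinpoint hubs) more concrete, the following is a minimal Python sketch. It is not the Mergeomics implementation or its wKDA statistic: it swaps in a plain hypergeometric over-representation test on each node's direct neighbors, and the function names (`hypergeom_sf`, `rank_hubs`), the toy edge list and the gene labels are all hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) when drawing n items from N, of which K are hits."""
    total = comb(N, n)
    upper = min(K, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible draws contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def rank_hubs(edges, disease_genes):
    """Rank nodes by enrichment of disease genes among direct neighbors.

    Simplified stand-in for a key driver analysis (an assumption of this
    sketch): undirected edges, one-step neighborhoods, no edge weights.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    nodes = set(neighbors)
    hits = set(disease_genes) & nodes

    results = []
    for node, nbrs in neighbors.items():
        # The candidate hub itself is excluded from the background.
        background_size = len(nodes) - 1
        background_hits = len(hits - {node})
        overlap = len(nbrs & hits)
        p = hypergeom_sf(overlap, background_size, background_hits, len(nbrs))
        results.append((node, len(nbrs), overlap, p))

    # Smallest p-value first; sorted() is stable for ties.
    return sorted(results, key=lambda row: row[3])


if __name__ == "__main__":
    # Hypothetical toy network: H is connected to three of the disease genes.
    toy_edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"),
                 ("D", "E"), ("E", "F"), ("F", "G")]
    toy_disease_genes = {"A", "B", "C"}
    for node, degree, overlap, p in rank_hubs(toy_edges, toy_disease_genes):
        print(node, degree, overlap, round(p, 4))
```

In this toy network the node `H` ranks first because three of its four neighbors belong to the disease gene set. A real analysis would use tissue-specific networks and correct for multiple testing, neither of which this sketch attempts.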

Highlights

In parallel to large-scale genomic projects, new computational tools are required to convert massive genomics data into biological insights that can lead to novel mechanistic hypotheses
MSEA is based on the notion that while it is difficult to say which marker is causal for a disease, if the markers associated with a biological process
Using Weighted key driver analysis (wKDA), we identified candidate key drivers in the liver and adipose tissues for each of the top six cholesterol-associated subnetworks

Summary

Introduction

In parallel to large-scale genomic projects, new computational tools are required to convert massive genomics data into biological insights that can lead to novel mechanistic hypotheses. The available methods are typically tailored for a particular combination of datasets (e.g. human genetics with gene expression, or human genetics with pathways or protein-protein interactions) and lack the flexibility to accommodate additional data types and multiple datasets from one or more species, tissues and platforms. Network approaches such as WGCNA and postgwas emphasize the detection of modules of co-operating genes, but validation experiments in the wet lab and therapeutic target selection require narrowing in on strong driver genes at the center of the module. The source code for Mergeomics is released as an R package (http://mergeomics.research.idre.ucla.edu/Download/Package/).

Journal: BMC Genomics
Publication Date: Nov 4, 2016
Citations: 104
License type: CC BY 4.0
