Exploring Integrative Analysis Using the BioMedical Evidence Graph.

Adam Struck, Ryan Spangler, Brian Walsh, Kyle Ellrott, Jordan A Lee, Alexander Buchanan, Joshua M Stuart

doi:10.1200/cci.19.00110

Abstract

PURPOSE: The analysis of cancer biology data involves extremely heterogeneous data sets, including information from RNA sequencing, genome-wide copy number, DNA methylation data reporting on epigenetic regulation, somatic mutations from whole-exome or whole-genome analyses, pathology estimates from imaging sections or subtyping, drug response or other treatment outcomes, and various other clinical and phenotypic measurements. Bringing these different resources into a common framework, with a data model that allows for complex relationships as well as dense vectors of features, will unlock integrated data set analysis.

METHODS: We introduce the BioMedical Evidence Graph (BMEG), a graph database and query engine for discovery and analysis of cancer biology. The BMEG differs from other biologic data graphs in that sample-level molecular and clinical information is connected to reference knowledge bases. It combines gene expression and mutation data with drug-response experiments, pathway information databases, and literature-derived associations.

RESULTS: The construction of the BMEG has resulted in a graph containing more than 41 million vertices and 57 million edges. The BMEG system provides a graph query-based application programming interface to enable analysis, with client code available for Python, JavaScript, and R, and a server online at bmeg.io. Using this system, we have demonstrated several forms of cross-data set analysis to show the utility of the system.

CONCLUSION: The BMEG is an evolving resource dedicated to enabling integrative analysis. We have demonstrated queries on the system that illustrate mutation significance analysis, drug-response machine learning, patient-level knowledge-base queries, and pathway-level analysis. We have compared the resulting graph to other available integrated graph systems and demonstrated that the BMEG is unique in the scale of the graph and the type of data it makes available.
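The data model described in the abstract, where sample-level records carrying dense feature vectors are linked by edges to reference knowledge such as genes and pathways, can be illustrated with a small in-memory sketch. This is not the BMEG API or its schema: the `ToyGraph` class, the vertex and edge labels (`has_mutation_in`, `member_of`, `tested_with`), and all sample, pathway, and compound values are hypothetical and chosen only to show how a graph traversal joins sample data to knowledge-base entries.

```python
from collections import defaultdict


class ToyGraph:
    """A tiny in-memory property graph (illustrative only, not the BMEG API)."""

    def __init__(self):
        self.vertices = {}                   # vertex id -> {"label": str, "data": dict}
        self.out_edges = defaultdict(list)   # source id -> [(edge label, target id)]

    def add_vertex(self, vid, label, **data):
        self.vertices[vid] = {"label": label, "data": data}

    def add_edge(self, src, label, dst):
        self.out_edges[src].append((label, dst))

    def out(self, vids, edge_label):
        """Follow outgoing edges with the given label from each vertex id."""
        return [dst for v in vids
                for (lab, dst) in self.out_edges[v] if lab == edge_label]


g = ToyGraph()

# Sample vertices carry a dense feature vector (made-up expression values).
g.add_vertex("sample:1", "Sample", expression={"TP53": 2.1, "KRAS": 7.4})
g.add_vertex("sample:2", "Sample", expression={"TP53": 5.0, "KRAS": 1.2})

# Reference knowledge: genes, a pathway, and a compound (all hypothetical).
g.add_vertex("gene:TP53", "Gene", symbol="TP53")
g.add_vertex("gene:KRAS", "Gene", symbol="KRAS")
g.add_vertex("pathway:p53", "Pathway", name="hypothetical p53 signaling")
g.add_vertex("drug:X", "Compound", name="hypothetical compound X")

# Edges connect sample-level data to the reference knowledge.
g.add_edge("sample:1", "has_mutation_in", "gene:KRAS")
g.add_edge("sample:2", "has_mutation_in", "gene:TP53")
g.add_edge("gene:TP53", "member_of", "pathway:p53")
g.add_edge("sample:1", "tested_with", "drug:X")


def mutated_pathways(graph, sample_id):
    """Traverse Sample -> Gene -> Pathway and return the pathway names."""
    genes = graph.out([sample_id], "has_mutation_in")
    pathways = graph.out(genes, "member_of")
    return sorted(graph.vertices[p]["data"]["name"] for p in pathways)


print(mutated_pathways(g, "sample:2"))
print(mutated_pathways(g, "sample:1"))
```

The two-hop traversal in `mutated_pathways` stands in for the kind of patient-level knowledge-base query the abstract mentions; a production graph database would express the same walk through its own query language rather than Python list comprehensions.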

Highlights

Biological data produced by large-scale projects routinely reaches petabyte levels thanks to major advances in sequencing and imaging.
The analysis of cancer biology data involves extremely heterogeneous data sets. Bringing these different resources into a common framework, with a data model that allows for complex relationships as well as dense vectors of features, will unlock integrative analysis.
We introduce a graph database and query engine for discovery and analysis of cancer biology, called the BioMedical Evidence Graph (BMEG).

Summary

Introduction

Biological data produced by large-scale projects routinely reaches petabyte levels thanks to major advances in sequencing and imaging. This exponential growth in size is well documented and is being addressed by multiple big-data initiatives. The sheer volume of heterogeneous data makes it difficult to normalize and integrate data and to perform integrative analysis across disparate experiments. Faced with these challenges, as well as substantial labor and computation costs, researchers may use only a fraction of the publicly available data in their analyses and may not update their data or analyses as new data become available.

Journal: JCO Clinical Cancer Informatics	Publication Date: Feb 25, 2020
Citations: 5	License type: cc-by
