The value of prior knowledge in machine learning of complex network systems.

Dana Ferranti, David Krane, David Craft

doi:10.1093/bioinformatics/btx438

Abstract

Our overall goal is to develop machine-learning approaches based on genomics and other relevant accessible information for use in predicting how a patient will respond to a given proposed drug or treatment. Given the complexity of this problem, we begin by developing, testing and analyzing learning methods using data from simulated systems, which gives us access to a known ground truth. We examine the benefits of using prior system knowledge and investigate how learning accuracy depends on various system parameters as well as the amount of training data available. The simulations are based on Boolean networks (directed graphs with 0/1 node states and logical node update rules), which are the simplest computational systems that can mimic the dynamic behavior of cellular systems. Boolean networks can be generated and simulated at scale, have complex yet cyclical dynamics and as such provide a useful framework for developing machine-learning algorithms for modular and hierarchical networks such as biological systems in general and cancer in particular. We demonstrate that utilizing prior knowledge (in the form of network connectivity information), without detailed state equations, greatly increases the power of machine-learning algorithms to predict network steady-state node values ('phenotypes') and perturbation responses ('drug effects'). Code and datasets are linked from https://gray.mgh.harvard.edu/people-directory/71-david-craft-phd (contact: dcraft@broadinstitute.org). Supplementary data are available at Bioinformatics online.
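To make the Boolean network description concrete, here is a minimal sketch of such a simulation. It is not the authors' code: the synchronous update scheme, the fixed number of inputs per node, the random truth tables and all function names (`make_random_network`, `step`, `find_attractor`) are assumptions chosen for illustration.

```python
import random
from itertools import product


def make_random_network(n_nodes, k_inputs, seed=0):
    """Build a random Boolean network (illustrative assumption, not the paper's generator).

    Each node gets k_inputs distinct input nodes (the directed edges) and a
    random truth table (its logical update rule).
    """
    rng = random.Random(seed)
    inputs = [rng.sample(range(n_nodes), k_inputs) for _ in range(n_nodes)]
    # One truth table per node: maps each tuple of input states to a 0/1 output.
    tables = [
        {bits: rng.randint(0, 1) for bits in product((0, 1), repeat=k_inputs)}
        for _ in range(n_nodes)
    ]
    return inputs, tables


def step(state, inputs, tables):
    """Synchronously update every node from the current states of its inputs."""
    return tuple(
        tables[i][tuple(state[j] for j in inputs[i])]
        for i in range(len(state))
    )


def find_attractor(state, inputs, tables):
    """Iterate until a state repeats and return the cycle of states reached."""
    seen = {}        # state -> index at which it first appeared
    trajectory = []  # states visited so far, in order
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state, inputs, tables)
    # Everything from the first occurrence of the repeated state is the cycle.
    return trajectory[seen[state]:]


inputs, tables = make_random_network(n_nodes=8, k_inputs=2, seed=1)
start = (0, 1, 0, 1, 1, 0, 0, 1)
attractor = find_attractor(start, inputs, tables)
print("cycle length:", len(attractor))
print("first state on the cycle:", attractor[0])
```

Because the state space is finite and the update is deterministic, every trajectory eventually enters a cycle, which matches the "cyclical dynamics" mentioned in the abstract. In this sketch, the node values on that cycle would play the role of a phenotype, and the `inputs` lists are the kind of connectivity information that could be supplied to a learner as prior knowledge.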

Highlights

The ability to better predict the response of a patient, regarding both intended therapeutic effect and potential toxicities, to a candidate drug, radiation, or other treatment modality would have immediate positive consequences for human health.
We verified that random forest and support vector machines, the best-performing algorithms, are effective in regression mode for this problem, but for simplicity we focus on classification and do not show any regression results.
Random forest, support vector machine, logistic regression, lasso, and nearest cluster are compared in Figure 4a, which clearly shows the advantage of random forest (RF) and support vector machines (SVM).

Summary

Introduction

The ability to better predict the response of a patient, regarding both intended therapeutic effect and potential toxicities, to a candidate drug, radiation, or other treatment modality would have immediate positive consequences for human health. Examples of gene/disease site pairs for which targeted therapies exist include HER2/breast cancer, BRAF/melanoma, and EGFR in colorectal and lung cancer. Even in these targeted cases, there is a wide spectrum of response to the drugs, which is due to heterogeneity across patients and within tumors [2, 3]. The problem of predicting how a patient will respond to a particular treatment can be framed as a statistical machine learning problem. One can view this as a regression problem if there are quantitative measures of response, or as a classification problem if the response is binary, e.g.



Journal: Bioinformatics
Publication Date: Jul 7, 2017
Citations: 19
License type: cc-by


Similar Papers

Sensitivity analysis of prior model probabilities and the value of prior knowledge in the assessment of conceptual model uncertainty in groundwater modelling
Rodrigo Rojas ... Luc Feyen
Hydrological Processes | VOL. 23
14 Jan 2009

Learning and memory of factual content from narrative and expository text
Michael B W Wolfe ... Joseph A Mienko
British Journal of Educational Psychology | VOL. 77
01 Sep 2007

Identification of Boolean Network Models From Time Series Data Incorporating Prior Knowledge.
Thomas Leifeld ... Zhihua Zhang
Frontiers in Physiology | VOL. 9
08 Jun 2018

Boolean network inference from time series data incorporating prior biological knowledge
Saad Haider ... Ranadip Pal
BMC Genomics | VOL. 13
01 Jan 2012
