Abstract

Our overall goal is to develop machine-learning approaches, based on genomics and other relevant accessible information, for predicting how a patient will respond to a proposed drug or treatment. Given the complexity of this problem, we begin by developing, testing and analyzing learning methods using data from simulated systems, which gives us access to a known ground truth. We examine the benefits of using prior system knowledge and investigate how learning accuracy depends on various system parameters as well as on the amount of training data available. The simulations are based on Boolean networks (directed graphs with 0/1 node states and logical node update rules), which are the simplest computational systems that can mimic the dynamic behavior of cellular systems. Boolean networks can be generated and simulated at scale and have complex yet cyclical dynamics; as such, they provide a useful framework for developing machine-learning algorithms for modular and hierarchical networks such as biological systems in general and cancer in particular. We demonstrate that utilizing prior knowledge in the form of network connectivity information, without detailed state equations, greatly increases the power of machine-learning algorithms to predict network steady-state node values ('phenotypes') and perturbation responses ('drug effects').

Availability: links to code and datasets are at https://gray.mgh.harvard.edu/people-directory/71-david-craft-phd. Contact: dcraft@broadinstitute.org. Supplementary information: supplementary data are available at Bioinformatics online.

Highlights

  • The ability to better predict a patient's response to a candidate drug, radiation, or other treatment modality, in terms of both intended therapeutic effect and potential toxicities, would have immediate positive consequences for human health

  • We verified that random forest and support vector machines, the best-performing algorithms, are also effective in regression mode for this problem; for simplicity, we focus on classification and do not show any regression results

  • Random forest (RF), support vector machine (SVM), logistic regression, lasso, and nearest cluster are compared in Figure 4a, which clearly shows the advantage of RF and SVM


Summary

Introduction

Introduction and motivation

The ability to better predict a patient's response to a candidate drug, radiation, or other treatment modality, in terms of both intended therapeutic effect and potential toxicities, would have immediate positive consequences for human health. Examples of gene/disease site pairs for which targeted therapies exist include HER2/breast cancer, BRAF/melanoma, and EGFR/colorectal and lung cancer. Even in these targeted cases, there is a wide spectrum of response to the drugs, which is due to heterogeneity across patients and within tumors [2, 3]. The problem of predicting how a patient will respond to a particular treatment can be framed as a statistical machine-learning problem. One can view this as a regression problem if there are quantitative measures of response or a classification problem if the response is binary, e.g.

