Abstract

Prediction of clinical drug response (CDR) of cancer patients, based on their clinical and molecular profiles obtained prior to administration of the drug, can play a significant role in individualized medicine. Machine learning models have the potential to address this issue, but training them requires data from a large number of patients treated with each drug, which limits their feasibility. While large databases of drug response and molecular profiles of preclinical in-vitro cancer cell lines (CCLs) exist for many drugs, it is unclear whether preclinical samples can be used to predict CDR of real patients. We designed a systematic approach to evaluate how well different algorithms, trained on gene expression and drug response of CCLs, can predict CDR of patients. Using data from two large databases, we evaluated various linear and non-linear algorithms, some of which utilized information on gene interactions. Then, we developed a new algorithm called TG-LASSO that explicitly integrates information on samples’ tissue of origin with gene expression profiles to improve prediction performance. Our results showed that regularized regression methods provide better prediction performance. However, neither including network information nor applying common methods of incorporating the tissue of origin improved the results. On the other hand, TG-LASSO improved the predictions and distinguished resistant and sensitive patients for 7 out of 13 drugs. Additionally, TG-LASSO identified genes associated with the drug response, including known targets and pathways involved in the drugs’ mechanism of action. Moreover, genes identified by TG-LASSO for multiple drugs in a tissue were associated with patient survival. In summary, our analysis suggests that preclinical samples can be used to predict CDR of patients and identify biomarkers of drug sensitivity and survival.

Highlights

  • Cancer is among the leading causes of death globally, and predicting patients’ drug response to different treatments from their clinical and molecular profiles can enable individualized cancer medicine.

  • We focused on drugs shared between the two datasets and used the gene expression profiles of samples to predict drug response, since previous studies have shown gene expression to be the most informative data type for this task [7].

Introduction

Cancer is one of the leading causes of death globally and is expected to be the most important obstacle to increasing life expectancy in the 21st century [1]. Individualized cancer medicine has the potential to revolutionize patient prognosis; two major challenges in this area are predicting individual responses to different treatments and identifying molecular biomarkers of drug sensitivity. While factors such as cancer type or symptoms have traditionally been used to select treatment [2], the development of high-throughput sequencing technologies [3] and sophisticated machine learning (ML) approaches make it possible to individualize treatment based on molecular ‘omics’ profiles of patients’ tumors [4]. Our goal in this study was to perform an unbiased, systematic evaluation on a panel of drugs to determine 1) whether regression models trained on in vitro preclinical samples can predict the clinical drug response (CDR) of real patients for each drug and 2) what type of side information (e.g. gene interactions or the samples’ tissue of origin) might improve CDR prediction.
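
The first question above can be made concrete with a toy example. The following is a minimal sketch, not the study's pipeline and not TG-LASSO: it fits a plain LASSO (one kind of regularized regression) by coordinate descent on synthetic "cell line" data and then predicts the response of synthetic "patients". The data, the function names (`fit_lasso`, `simulate`, and so on), the choice of 10 genes, and the penalty `alpha=0.1` are all assumptions made for illustration.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; this is what zeroes out weak genes."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_lasso(X, y, alpha, n_sweeps=200):
    """Coordinate descent for (1/(2n)) * ||y - b0 - Xw||^2 + alpha * ||w||_1."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b0 = sum(y) / n
    resid = [yi - b0 for yi in y]
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of gene j with the residual, adding back gene j's own fit.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_wj = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                w[j] = new_wj
        # Re-centre the residual so the intercept stays optimal.
        shift = sum(resid) / n
        b0 += shift
        resid = [r - shift for r in resid]
    return b0, w

def predict(X, b0, w):
    """Linear prediction: intercept plus weighted sum of expression values."""
    return [b0 + sum(x * wj for x, wj in zip(row, w)) for row in X]

def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def simulate(rng, n_samples, n_genes=10):
    """Synthetic expression; response depends only on genes 0 and 3 (assumption)."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_samples)]
    y = [2.0 * row[0] - 1.5 * row[3] + rng.gauss(0.0, 0.1) for row in X]
    return X, y

rng = random.Random(0)
ccl_X, ccl_y = simulate(rng, 60)          # stand-in for cell-line training data
patient_X, patient_y = simulate(rng, 30)  # stand-in for held-out patients

b0, w = fit_lasso(ccl_X, ccl_y, alpha=0.1)
patient_pred = predict(patient_X, b0, w)
print("selected genes:", [j for j, wj in enumerate(w) if wj != 0.0])
print("correlation on patients:", round(pearson(patient_pred, patient_y), 3))
```

Here the synthetic patients are drawn from the same model as the synthetic cell lines, which real tumors would not be, so the resulting correlation says nothing about clinical performance. The sketch only shows the train-on-preclinical, test-on-patients setup and how the L1 penalty selects a small set of genes.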
