Abstract

Accurately predicting the response of a cancer patient to a therapeutic agent remains an important challenge in precision medicine. With the rise of data science, researchers have applied computational models to study the inhibitory effects of drugs on cancers based on cancer genomics and transcriptomics. Moreover, DNA methylation, a common epigenetic modification, has been linked to the occurrence and development of cancer as well as to drug effectiveness. Exploring the relationship between DNA methylation and drug effectiveness can therefore help improve drug response prediction. Here, we propose a computational model that predicts drug responses in cancers by integrating cancer genomics, transcriptomics, epigenomics, and compound chemical properties. We also applied a regularized regression model, the Least Absolute Shrinkage and Selection Operator (lasso), to detect the methylation sites that are closely related to drug effectiveness. The prediction models were trained on a well-known pharmacogenomics data resource, Genomics of Drug Sensitivity in Cancer (GDSC). Cross-validation indicates that the prediction model using DNA methylation performs comparably to models using other cancer omics data, including oncogene mutation and gene expression data, which points to an important role for DNA methylation in the prediction of drug responses. Analyses of the Encyclopedia of DNA Elements (ENCODE) and Transcriptional Regulatory Relationships Unraveled by Sentence-based Text mining (TRRUST2) databases suggest that the methylation sites associated with drug effectiveness are mainly located in transcription factor (TF) binding regions. We therefore hypothesize that the sensitivity of cancer cells to drugs could be regulated by changing the methylation of TF binding regions.
In conclusion, we confirmed the important role of DNA methylation in the prediction of drug responses and provide a set of methylation sites that are closely related to drug effectiveness, which may be promising regulatory targets for improving the effects of drug treatment in cancer patients.

Highlights

  • Precision medicine is a medical concept based on personalized medicine that has developed alongside the rapid progress of genome sequencing technology and the cross-application of biological information and big data science (Hodson, 2016)

  • Here, we assessed the contribution of DNA methylation to the prediction of drug responses by comparing it with that of other cancer omics data using three machine learning algorithms. We also identified the methylation sites closely related to drug effectiveness with a Least Absolute Shrinkage and Selection Operator (lasso) regression model, which performs both variable selection and regularization to improve prediction accuracy and enhance the interpretability of the statistical model (Fadil and William, 1986; Tibshirani, 1996; Yvan et al, 2007; Lockhart et al, 2014)

  • Encyclopedia of DNA Elements (ENCODE) and Transcriptional Regulatory Relationships Unraveled by Sentence-based Text mining (TRRUST2) database analyses suggest that the methylation sites associated with drug effectiveness are mainly located in the transcription factor (TF) binding region
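
The way a lasso model performs variable selection can be illustrated with a small sketch. The code below is not the authors' implementation: it is a minimal coordinate-descent solver for the objective (1/(2n))·||y − Xw||² + λ·||w||₁, written in plain Python, and the feature matrix, response values, and penalty `lam` are all hypothetical toy values. In practice an established statistical package would be used and the penalty would typically be chosen by cross-validation.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_scale[j] == 0.0:
                continue  # a constant-zero feature carries no information
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_scale[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Hypothetical standardized features for three sites across four samples.
# The toy response depends strongly on site 0, weakly on site 1, not on site 2.
X_toy = [
    [1.0, 1.0, 1.0],
    [1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
y_toy = [2.05, 1.95, -1.95, -2.05]  # 2.0 * site 0 + 0.05 * site 1

weights = lasso_coordinate_descent(X_toy, y_toy, lam=0.1)
selected_sites = [j for j, w_j in enumerate(weights) if w_j != 0.0]
```

With this toy design, the weight of site 0 shrinks from 2 to about 1.9, while the weakly related and unrelated sites receive weights of exactly zero, so only site 0 remains in `selected_sites`. This is the sense in which the lasso combines regularization with variable selection.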

Summary

INTRODUCTION

Precision medicine is a medical concept based on personalized medicine that has developed alongside the rapid progress of genome sequencing technology and the cross-application of biological information and big data science (Hodson, 2016). The GDSC project provides a large-scale collection of cancer genomic data for therapeutic biomarker discovery (Yang et al, 2013). It includes mutations for 19,100 genes across 1,001 cancer cell lines, DNA copy number variations for 46,221 genes across 996 cancer cell lines, DNA methylation (β-value) for 14,725 CpG islands across 1,029 cancer cell lines, and expression for 17,737 mRNAs across 1,018 cancer cell lines (Yang et al, 2013; Iorio et al, 2016). We constructed the similarity matrix $S_{\mathrm{Methy}}$ based on these DNA methylation data: $S_{\mathrm{Methy}}(c_i, c_j) = \exp(-\alpha \lVert c_i - c_j \rVert^2)$, where $c_i$ and $c_j$ are the methylation profiles of the $i$-th and $j$-th cancer cell lines, respectively, and $\alpha$ is a pre-defined parameter (set to 0.0001 here). The TRRUST2 database, the most comprehensive public database for literature-curated TF-target interactions in humans (Han et al, 2018), was introduced to test whether the methylation sites share loci with the TF binding regions of downstream genes.
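
The Gaussian similarity between cell-line methylation profiles described above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the authors' code. It assumes that the trailing 2 in the source formula denotes a squared Euclidean norm, and the function name `gaussian_similarity` and the toy β-values are hypothetical.

```python
import math


def gaussian_similarity(profiles, alpha=1e-4):
    """Return S with S[i][j] = exp(-alpha * ||c_i - c_j||^2).

    Assumption: the norm is the squared Euclidean distance between profiles.
    """
    n = len(profiles)
    sim = [[1.0] * n for _ in range(n)]  # the diagonal is exp(0) = 1
    for i in range(n):
        for j in range(i + 1, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = sim[j][i] = math.exp(-alpha * sq_dist)
    return sim


# Hypothetical beta-values for three cell lines across four CpG islands.
toy_profiles = [
    [0.10, 0.80, 0.35, 0.90],
    [0.12, 0.75, 0.40, 0.88],
    [0.90, 0.05, 0.70, 0.10],
]
similarity = gaussian_similarity(toy_profiles)  # alpha = 0.0001 as in the text
```

The resulting matrix is symmetric with ones on the diagonal, and cell lines with closer methylation profiles (here the first two) receive a higher similarity than dissimilar ones. With α = 0.0001 and only four features, all values stay very close to 1. With thousands of CpG islands per profile, the squared distances are much larger and the values spread out accordingly.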

RESULTS
DISCUSSION
DATA AVAILABILITY STATEMENT