Abstract

The co-expression extrapolation (COXEN) method has been used successfully in multiple studies to select genes for predicting the response of tumor cells to a specific drug treatment. Here, we enhance the COXEN method to select genes that are predictive of the efficacies of multiple drugs, so that general drug response prediction models can be built that are not specific to a particular drug. The enhanced COXEN method first ranks the genes according to their prediction power for each individual drug and then takes the union of the top predictive genes across all drugs. From this union, the algorithm further selects the genes whose co-expression patterns are well preserved between cancer cases for building prediction models. We apply the proposed method to benchmark in vitro drug screening datasets and compare the performance of prediction models built on genes selected by the enhanced COXEN method with that of models built on genes selected by the original COXEN method and on randomly picked genes. Models built with the enhanced COXEN method always show a statistically significant improvement in prediction performance (adjusted p-value ≤ 0.05). Our results demonstrate that the enhanced COXEN method can dramatically increase the power of gene expression data for predicting drug response.

Highlights

  • Cancer is a heterogeneous disease at both the histologic and genetic levels

  • To evaluate the proposed enhanced co-expression extrapolation (COXEN) method, we applied it to three benchmark in vitro drug screening datasets, the Cancer Cell Line Encyclopedia (CCLE) dataset [18], the Genentech Cell response

  • We developed an enhanced COXEN method to select predictive and generalizable genes for building general drug response prediction models

Summary

Introduction

Cancer is a heterogeneous disease at both the histologic and genetic levels. Patients with the same cancer histology can respond differently to the same treatment. Accurate prediction of a patient’s response to a drug treatment is of paramount importance to the success of precision oncology. Multiple types of tumor omics data have been used in many studies for predicting anti-cancer drug response [1,2,3,4,5], among which transcriptome data have been shown to be the most important for drug response prediction [6,7]. Transcriptome data usually contain the expression values of about 20,000 genes, so training prediction models on them can be computationally expensive and can cause overfitting when the number of samples is small. Gene selection is therefore frequently applied to select the group of genes most useful for predicting drug response [6,8,9].
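
The selection procedure outlined in the abstract (rank genes per drug, take the union of the top genes, then keep the genes with the best-preserved co-expression patterns) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that prediction power is measured by the absolute Pearson correlation between a gene's expression and the drug response, that preservation is measured by correlating each gene's co-expression profile across two expression matrices, and that no gene is constant across samples. The function names and the parameters `k_per_drug` and `n_final` are hypothetical and chosen only for illustration.

```python
import numpy as np


def top_genes_for_drug(expr, response, k):
    """Return indices of the k genes most correlated (in absolute value) with response.

    expr: (n_samples, n_genes) expression matrix.
    response: (n_samples,) drug response values aligned with the rows of expr.
    Assumption: prediction power is approximated by absolute Pearson correlation.
    """
    xc = expr - expr.mean(axis=0)
    yc = response - response.mean()
    denom = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = np.inf  # constant genes or responses get zero correlation
    corr = (xc.T @ yc) / denom
    return np.argsort(-np.abs(corr))[:k]


def select_genes(expr_a, responses, expr_b, k_per_drug, n_final):
    """Sketch of a multi-drug, COXEN-style gene selection.

    expr_a: expression matrix of the dataset that has drug response data.
    responses: dict mapping a drug name to a response vector aligned with expr_a.
    expr_b: expression matrix of a second dataset with the same gene columns.
    """
    # Steps 1 and 2: rank genes for each drug and take the union of the top genes.
    candidates = set()
    for y in responses.values():
        candidates.update(top_genes_for_drug(expr_a, y, k_per_drug).tolist())
    candidates = np.array(sorted(candidates))

    # Step 3: gene-by-gene co-expression matrices within each dataset,
    # restricted to the candidate genes.
    co_a = np.corrcoef(expr_a[:, candidates], rowvar=False)
    co_b = np.corrcoef(expr_b[:, candidates], rowvar=False)

    # For each candidate, correlate its co-expression profile in dataset A
    # with its profile in dataset B, leaving out the gene's self-correlation.
    n = len(candidates)
    off_diag = ~np.eye(n, dtype=bool)
    preservation = np.array([
        np.corrcoef(co_a[i, off_diag[i]], co_b[i, off_diag[i]])[0, 1]
        for i in range(n)
    ])

    # Keep the candidates whose co-expression patterns are best preserved.
    keep = np.argsort(-preservation)[:n_final]
    return candidates[keep]
```

Calling `select_genes(expr_a, responses, expr_b, k_per_drug=200, n_final=100)` would return the column indices of the selected genes, which could then be used to subset both expression matrices before a prediction model is trained. The cutoffs shown are placeholders, not values taken from the paper.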

