Iterative sure independent ranking and screening for drug response prediction

Biao An,Yufang Qin,Yun Fang,Qianwen Zhang,Ming Chen

doi:10.1186/s12911-020-01240-9


Open Access


https://doi.org/10.1186/s12911-020-01240-9


Abstract


Background: Prediction of drug response based on multi-omics data is a crucial task in the research of personalized cancer therapy.

Results: We proposed an iterative sure independent ranking and screening (ISIRS) scheme to select drug response-associated features and applied it to the Cancer Cell Line Encyclopedia (CCLE) dataset. For each drug in CCLE, we incorporated multi-omics data, including copy number alterations, mutations, and gene expression, and selected up to 50 features using ISIRS. A linear regression model based on the selected features was then used to predict the drug response. Cross-validation tests show that our prediction accuracies are higher than those of existing methods for most drugs.

Conclusions: Our study indicates that the features selected by the marginal utility measure, which measures the conditional probability of drug responses given the feature, are helpful for drug response prediction.
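The screen-then-regress pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' ISIRS implementation: it performs a single, non-iterative screening pass, it assumes the marginal utility takes the form commonly given for sure independent ranking and screening (the average over j of the squared mean of x_ik * 1{y_i < y_j}, with standardized features), and it runs on synthetic data rather than CCLE. The function names `sirs_utility`, `screen_and_fit`, and `predict` are hypothetical.

```python
import numpy as np


def sirs_utility(X, y):
    """Marginal utility per column of X (assumed SIRS form, see lead-in).

    omega_k = mean over j of [ mean over i of z_ik * 1{y_i < y_j} ]^2,
    where z is X with each column standardized.
    """
    n = X.shape[0]
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant columns end up with utility 0, not NaN
    Z = (X - X.mean(axis=0)) / std
    # indicator[i, j] is 1.0 when y_i < y_j
    indicator = (y[:, None] < y[None, :]).astype(float)
    # inner[j, k] = (1/n) * sum_i Z[i, k] * indicator[i, j]
    inner = indicator.T @ Z / n
    return (inner ** 2).mean(axis=0)


def screen_and_fit(X, y, n_keep=50):
    """Keep the n_keep highest-ranked features, then fit least squares."""
    omega = sirs_utility(X, y)
    selected = np.argsort(omega)[::-1][:n_keep]
    # Design matrix with an intercept column followed by the kept features
    design = np.column_stack([np.ones(X.shape[0]), X[:, selected]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return selected, coef


def predict(X, selected, coef):
    """Apply the fitted linear model to new samples."""
    return coef[0] + X[:, selected] @ coef[1:]


# Synthetic stand-in for omics data: 200 samples, 1000 features,
# with only features 0 and 1 driving the response.
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

selected, coef = screen_and_fit(X, y, n_keep=50)
y_hat = predict(X, selected, coef)
```

The cap of 50 features mirrors the limit stated in the abstract. In practice the prediction accuracy would be estimated by cross-validation, with the screening step repeated inside each fold so that held-out samples do not influence feature selection.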

Highlights

Prediction of drug response based on multi-omics data is a crucial task in the research of personalized cancer therapy
We propose an iterative sure independent ranking and screening (ISIRS) scheme to select drug response-associated features and apply it to the Cancer Cell Line Encyclopedia (CCLE) dataset
ISIRS extends sure independent ranking and screening (SIRS) with an iterative scheme

Summary

Introduction

Prediction of drug response based on multi-omics data is a crucial task in the research of personalized cancer therapy. Researchers have tried many methods to find biomarkers and predict drug sensitivity. These methods are mainly based on gene expression measurements. Staunton et al. proposed a weighted voting classification strategy to classify each cell line as sensitive or resistant to each drug based on the NCI-60 gene expression data [2]. Menden et al. [9] developed a machine learning model to predict the response of cancer cell lines to drug treatment based on both the genomic features of the cell lines and the chemical properties of the drugs considered. Despite their success in finding some drug biomarkers, these approaches still suffer from the typical “high dimension, low sample size” problem in statistical learning: compared with the large number of expressed genes and chemical compounds (p), the number of samples (n) is very limited.

Objectives

Methods

Results