Data from Bimodal Gene Expression in Patients with Cancer Provides Interpretable Biomarkers for Drug Sensitivity

Benjamin Haibe-Kains,Wail Ba-Alawi,Sisira Kadambat Nair,Arvind Singh Mer,Anthony Mammoliti,Petr Smirnov,Linda Z. Penn,Bo Li

doi:10.1158/0008-5472.c.6514155.v1

Abstract

<div>AbstractIdentifying biomarkers predictive of cancer cell response to drug treatment constitutes one of the main challenges in precision oncology. Recent large-scale cancer pharmacogenomic studies have opened new avenues of research to develop predictive biomarkers by profiling thousands of human cancer cell lines at the molecular level and screening them with hundreds of approved drugs and experimental chemical compounds. Many studies have leveraged these data to build predictive models of response using various statistical and machine learning methods. However, a common pitfall to these methods is the lack of interpretability as to how they make predictions, hindering the clinical translation of these models. To alleviate this issue, we used the recent logic modeling approach to develop a new machine learning pipeline that explores the space of bimodally expressed genes in multiple large in vitro pharmacogenomic studies and builds multivariate, nonlinear, yet interpretable logic-based models predictive of drug response. The performance of this approach was showcased in a compendium of the three largest in vitro pharmacogenomic datasets to build robust and interpretable models for 101 drugs that span 17 drug classes with high validation rates in independent datasets. These results along with in vivo and clinical validation support a better translation of gene expression biomarkers between model systems using bimodal gene expression.Significance:A new machine learning pipeline exploits the bimodality of gene expression to provide a reliable set of candidate predictive biomarkers with a high potential for clinical translatability.</div>

Full Text