Despite the application of conventional predictive methodologies to genomic data, predicting drug sensitivity remains a daunting challenge for healthcare organizations in the USA. Cancer is a highly dynamic, adaptive disease: tumor cells have repeatedly shown the capability to evolve mechanisms that evade therapeutic interventions. Moreover, a single genomic alteration seldom predicts drug sensitivity. This research project aimed to address these challenges by leveraging the GDSC dataset, an extensive resource connecting the genomic profiles of cancer cell lines with their sensitivity to a wide range of anti-cancer drugs. The key focus was identifying robust genomic markers, including specific mutations, gene expression patterns, and epigenetic modifications, associated with drug sensitivity or resistance. The predictive models used advanced machine learning and statistical methods to analyze the complex relationships between genomic alterations and drug sensitivity. The dataset was obtained from Kaggle. It was compiled by the Genomics of Drug Sensitivity in Cancer (GDSC) project, a collaboration between the Sanger Institute in the United Kingdom and the Massachusetts General Hospital Cancer Center in the United States. Data were collected through large-scale screening of human cancer cell lines of various types against a wide range of anti-cancer drugs. Cell viability was measured using the CellTiter-Glo assay following 72 hours of drug treatment. Three machine learning models were deployed, namely Random Forest, Linear Regression, and XGBoost, each with specific strengths. Performance was evaluated using mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and R².
Among the three models, Random Forest performed best on this dataset across all metrics: it achieved the lowest MAE, MSE, and RMSE, indicating the most accurate predictions of the target variable, and the highest R-squared value. Investigating the model predictions in drug sensitivity analysis can provide an overview of the mechanisms that underlie both tumor response and resistance. The proposed predictive models have the potential to significantly impact clinical decision-making in cancer therapy. Such models can support informed decisions regarding a patient's risk of disease recurrence, response to specific therapies, and prognosis, based on complex clinical and genomic data.
Keywords: Genomic Predictors; Drug Sensitivity; Genomic Markers; Personalized Medicine; Machine Learning; Random Forest Algorithm.
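For readers unfamiliar with the four evaluation metrics named in the abstract, the following is a minimal sketch of how they are conventionally defined. It is not the authors' code: the function name `regression_metrics` and the sample values are hypothetical and chosen only for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    # Residuals: observed value minus predicted value
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mse = ss_res / n                             # mean squared error
    mae = sum(abs(e) for e in errors) / n        # mean absolute error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R^2 is undefined when the observed values have zero variance
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}


# Hypothetical observed and predicted drug-response values (illustration only)
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Lower MSE, RMSE, and MAE indicate smaller prediction errors, while an R² closer to 1 indicates that the model explains more of the variance in the observed response. This is the sense in which a model can be said to perform best across all four metrics.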