Abstract

Post-marketing surveillance of antineoplastic agents evaluates their efficacy and safety in patients, with the aim of expanding drug indications and discovering potential adverse events. Real-world surveillance data are fraught with missing values, yet literature on strategies for handling missing data in this setting is scarce. Using machine learning (ML) algorithms to predict the therapeutic outcomes of PD-1/PD-L1 inhibitors has attracted attention. However, training a predictive model usually requires imaging or biomarker information, which is rarely available in post-marketing surveillance data. To address these challenges, we propose an ML-aided framework to predict the outcomes of anti-PD-1 therapy for gynecological malignancy on a dataset of 117 patient samples, treated with Camrelizumab (50 patient samples), Sintilimab (44), and Toripalimab (23). Four therapeutic outcomes are predicted: Response Evaluation Criteria in Solid Tumours (RECIST), organ adverse effect (AE), general AE, and death. The proposed framework feeds the dataset into a learning pipeline consisting of imputation, feature engineering, model training, ensemble learning, and model selection to generate the final predictive model. We conduct experiments to justify several critical design choices, such as the specific feature engineering strategies and the SMOTE over-sampling technique. The final model for each learning task is selected from a large pool of candidate models based on a joint consideration of accuracy and F1. Moreover, we conduct a thorough, visualized model analysis to gain a deeper understanding of model behavior and feature importance. The results, analysis, and findings demonstrate the superiority of the proposed learning-aided framework.
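
The abstract states that the final model for each task is chosen by jointly considering accuracy and F1, but it does not say how the two are combined. The sketch below is one possible reading, assuming (hypothetically) that candidates are ranked by the unweighted mean of the two scores. The function names, the ranking rule, and the toy predictions are illustrative and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1(y_true: List[int], y_pred: List[int], positive: int = 1) -> float:
    """Binary F1 for the given positive class; 0.0 when there are no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_model(
    y_true: List[int], candidates: Dict[str, List[int]]
) -> Tuple[str, float, float]:
    """Pick the candidate with the highest mean of accuracy and F1.

    ASSUMPTION: the unweighted mean is a stand-in for the paper's
    unspecified "joint consideration" of the two metrics.
    """
    def joint(name: str) -> float:
        preds = candidates[name]
        return (accuracy(y_true, preds) + f1(y_true, preds)) / 2

    best = max(candidates, key=joint)
    return best, accuracy(y_true, candidates[best]), f1(y_true, candidates[best])


# Toy labels and predictions (hypothetical, not from the study).
labels = [1, 0, 1, 1, 0, 0]
pool = {
    "always_negative": [0, 0, 0, 0, 0, 0],
    "candidate_b": [1, 0, 1, 0, 0, 0],
}
print(select_model(labels, pool))
```

Including F1 alongside accuracy penalizes a candidate that scores well only by predicting the majority class, which matters for imbalanced outcomes such as death.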

Highlights

  • Conducting a post-marketing surveillance study is significant for evaluating the efficacy and safety of a drug in clinical practice [1]

  • We focus on four therapeutic outcomes as the prediction targets, including Response Evaluation Criteria in Solid Tumours (RECIST), organ adverse effect (AE), general AE, and death

  • A five-fold CV was conducted for recursive feature elimination (RFE) and hyperparameter tuning to determine the optimal set of features and hyperparameters, respectively
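
The five-fold cross-validation mentioned in the last highlight can be sketched as follows. This is a minimal illustration of the fold loop used to compare hyperparameter values; it does not reproduce the paper's models or its recursive feature elimination step, which would sit inside the same loop. The toy nearest-neighbour classifier, the function names, and the data are all hypothetical.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence


def k_fold_indices(n: int, k: int = 5, seed: int = 0) -> List[List[int]]:
    """Shuffle indices 0..n-1 and deal them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_accuracy(fit: Callable, predict: Callable,
                X: Sequence[float], y: Sequence[int],
                k: int = 5, seed: int = 0) -> float:
    """Mean held-out accuracy over k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return mean(scores)


def knn_fit(n_neighbors: int) -> Callable:
    """Toy 1-D nearest-neighbour 'training': just store the data."""
    def fit(X_train, y_train):
        return n_neighbors, list(zip(X_train, y_train))
    return fit


def knn_predict(model, x: float) -> int:
    """Majority vote among the n_neighbors closest training points."""
    n_neighbors, data = model
    nearest = sorted(data, key=lambda pair: abs(pair[0] - x))[:n_neighbors]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


# Hypothetical, well-separated toy data; the grid is illustrative only.
X = [float(v) for v in range(10)] + [float(v) for v in range(100, 110)]
y = [0] * 10 + [1] * 10
grid = [1, 3, 5]
best = max(grid, key=lambda n: cv_accuracy(knn_fit(n), knn_predict, X, y))
print("selected n_neighbors:", best)
```

Every sample is held out exactly once, so each hyperparameter value is scored only on data its model did not see during fitting.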


Summary

INTRODUCTION

Conducting a post-marketing surveillance study is significant for evaluating the efficacy and safety of a drug in clinical practice [1]. We compare these efforts in seven dimensions, including cancer type, dataset size, the number of predicted outcomes, the usage of imputation, feature enhancement, data augmentation, feature selection, and ensemble learning. For the task of predicting the outcomes of anti-PD-1 therapy, methods for handling missing values and additional learning strategies remain to be explored. We propose an ML-based framework to perform predictive analysis on the outcomes of anti-PD-1 therapy for patients with gynecological cancers. A total of 15 features were collected from the patients: age, history of hepatitis, tumor type, PD-1/PD-L1 inhibitor, lines of therapy, chemotherapy, radiotherapy, targeted therapy, start time, medication cycle, white blood cells (WBCs), aspartate aminotransferase (AST), alanine aminotransferase (ALT), thyroid-stimulating hormone (TSH), and tubercle bacillus (TB). We observe that the twelve patients with over ten medication cycles were all given Camrelizumab.
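
To make the missing-value problem concrete, the sketch below fills gaps in a small table of patient records. The summary does not state which imputation method the framework uses, so this assumes a simple baseline (median for numeric features, most frequent value for categorical ones) purely for illustration. The field names and values are hypothetical toy data, not study data.

```python
from statistics import median, mode
from typing import Any, Dict, List, Sequence

Record = Dict[str, Any]


def impute(records: List[Record],
           numeric_keys: Sequence[str],
           categorical_keys: Sequence[str]) -> List[Record]:
    """Return copies of the records with None values filled in.

    ASSUMPTION: median/mode filling is a baseline stand-in, not the
    method used in the paper.
    """
    filled = [dict(r) for r in records]  # leave the input untouched
    plan = [(k, median) for k in numeric_keys]
    plan += [(k, mode) for k in categorical_keys]
    for key, summarize in plan:
        observed = [r[key] for r in records if r.get(key) is not None]
        if not observed:
            continue  # nothing to learn from; leave the gaps as they are
        fill_value = summarize(observed)
        for r in filled:
            if r.get(key) is None:
                r[key] = fill_value
    return filled


# Hypothetical toy records using two of the feature names listed above.
patients = [
    {"age": 50, "tumor_type": "cervical"},
    {"age": None, "tumor_type": "cervical"},
    {"age": 60, "tumor_type": None},
    {"age": 70, "tumor_type": "ovarian"},
]
for row in impute(patients, ["age"], ["tumor_type"]):
    print(row)
```

In a cross-validated pipeline the fill values would be computed on the training folds only, so that held-out patients do not leak into the statistics.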

OUTCOME
FEATURE ENGINEERING
RECURSIVE FEATURE ELIMINATION
MODEL EVALUATION
DISCUSSION