A machine learning case study to predict rare clinical event of interest: imbalanced data, interpretability, and practical considerations

Sheng Zhong, Jane Zhang, Jenny Jiao, Hongjian Zhu, Yunzhao Xing, Li Wang

doi:10.1080/10543406.2024.2364722

Abstract

Accurate prediction of a rare and clinically important event following study treatment is crucial in drug development. For instance, the rarity of an adverse event is often commensurate with the seriousness of its medical consequences, and delayed detection of a rare adverse event can pose significant or even life-threatening health risks to patients. In this machine learning case study, we use an example originating from a real clinical trial setting to demonstrate how to define and solve the rare clinical event prediction problem using machine learning in the pharmaceutical industry. The unique contributions of this work include a proposed six-step investigation framework that facilitates communication with non-technical stakeholders and the interpretation of model performance in terms of practical consequences for patient screening in a future clinical trial. In terms of machine learning methodology, to split the data into training and test sets, we adapt the rare-event stratified split approach (from scikit-learn) so that it simultaneously accounts for group splitting of the multiple records belonging to one patient. To handle the imbalanced data caused by rare events in model training, a cost-sensitive learning approach is employed to give more weight to the minority class, and precision together with recall, rather than the raw accuracy rate, is used to capture prediction performance. Finally, we demonstrate how to apply state-of-the-art SHAP values to identify important risk factors and improve model interpretability.
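The splitting, weighting, and metric ideas named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names, the toy data, and the choice to stratify on a per-patient "any event" flag are assumptions made for illustration, and the weight formula is the common "balanced" heuristic rather than anything stated in the abstract.

```python
import random
from collections import Counter, defaultdict


def stratified_group_split(patient_ids, labels, test_frac=0.25, seed=0):
    """Split record indices so every patient falls in exactly one set.

    Assumption: patients are stratified by whether they have at least one
    event record, so both sets keep a share of the rare positives.
    """
    # Collapse records to one label per patient: 1 if any record is an event.
    patient_label = defaultdict(int)
    for pid, y in zip(patient_ids, labels):
        patient_label[pid] = max(patient_label[pid], y)

    rng = random.Random(seed)
    test_patients = set()
    for stratum in (0, 1):
        members = sorted(p for p, y in patient_label.items() if y == stratum)
        rng.shuffle(members)
        # At least one patient per non-empty stratum goes to the test set.
        # (A stratum with a single patient would leave none for training.)
        n_test = max(1, round(test_frac * len(members))) if members else 0
        test_patients.update(members[:n_test])

    train_idx = [i for i, p in enumerate(patient_ids) if p not in test_patients]
    test_idx = [i for i, p in enumerate(patient_ids) if p in test_patients]
    return train_idx, test_idx


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


def precision_recall(y_true, y_pred):
    """Precision and recall for the positive (event) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up toy data: 20 patients with 3 records each; patients 0 and 1
# experience the event on every record.
patient_ids = [pid for pid in range(20) for _ in range(3)]
labels = [1 if pid < 2 else 0 for pid in patient_ids]

train_idx, test_idx = stratified_group_split(patient_ids, labels)

# Class weights that a cost-sensitive learner could use on the training set.
weights = balanced_class_weights([labels[i] for i in train_idx])

# A trivial model that never predicts the event, scored on the test set.
y_true = [labels[i] for i in test_idx]
y_pred = [0] * len(y_true)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall = precision_recall(y_true, y_pred)

print(f"weights={weights}")
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

On this toy data the never-predict-the-event model scores 0.80 accuracy while its recall is 0, which is the reason precision and recall are more informative than raw accuracy when events are rare. In practice, scikit-learn's `StratifiedGroupKFold` and the `class_weight` option of many of its classifiers provide comparable functionality.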

More From: Journal of Biopharmaceutical Statistics

Similar Papers

Interpretable machine learning models for failure cause prediction in imbalanced oil pipeline data
Bright Awuku ... Nita Yodo
Measurement Science and Technology | VOL. 35
24 Apr 2024

Cost-sensitive learning for imbalanced medical data: a review
Imane Araf ... Ikram Chairi
Artificial Intelligence Review | VOL. 57
01 Mar 2024

Standardizing Safety Assessment and Reporting for Neonatal Clinical Trials
Jonathan M Davis ... Ron Portman
The Journal of Pediatrics | VOL. 219
08 Nov 2019

Multimodal data for systolic and diastolic blood pressure prediction: The hypertension conscious artificial intelligence
Quincy A Hathaway ... Naveena Yanamala
EBioMedicine | VOL. 84
13 Sep 2022
