Area Under The Precision-recall Curve Research Articles

Abstract: In early-stage hormone receptor-positive breast cancer, genomic risk scores identify patients who stand to benefit from up-front chemotherapy but introduce financial and logistical hurdles to care. We assembled a cohort of 5,244 patients with 11,671 corresponding whole-slide images of breast tumors stained with hematoxylin and eosin. We developed a multimodal machine learning model to infer risk of distal metastatic recurrence from routine clinical data. Specifically, the model interprets text from the pathologist’s report using a large language model and uses self-supervised vision transformers to interpret the corresponding whole-slide image. Tensor fusion joins the modalities to infer Genomic Health’s Oncotype DX recurrence score. Inferred recurrence score from the multimodal model correlated with measured score with a concordance correlation coefficient of 0.64 (95% C.I. 0.59 - 0.69) in the withheld test set, compared to 0.55 (95% C.I. 0.49 - 0.61) and 0.56 (95% C.I. 0.52 - 0.60) for the linguistic and visual unimodal models, respectively. The multimodal model attains an area under the precision-recall curve (AUPRC) of 0.69 (AUROC=0.88) for identifying high-risk disease in the full-information setting (when images and pathology reports with quantitative hormone receptor status and grade are available) in a withheld test set, compared to AUPRC of 0.61 and 0.66 for the linguistic and visual models, respectively. By comparison, in the same full-information setting, the clinical nomogram introduced by Orucevic et al. in 2019 achieves an AUPRC of 0.48. We suggest the operating point at which precision is 94.4% and recall is 33.3%. Digitized whole-slide images of routine breast biopsies and their associated synoptic pathology reports contain much of the information necessary to stratify patients by risk of distal metastatic recurrence, when modeled appropriately.
Our model could enable hospitals to rapidly triage the need for genomic risk testing, possibly precluding one third of orders without loss of accuracy. This helps allocate scarce resources for genomic tests and valuable weeks prior to beginning therapy while maintaining the standard of precision oncology. Citation Format: Kevin M. Boehm, Antonio Marra, Jorge S. Reis-Filho, Sarat Chandarlapaty, Fresia Pareja, Sohrab P. Shah. Multimodal modeling of digitized histopathology slides improves risk stratification in hormone receptor-positive breast cancer patients [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 890.
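
The abstract above reports agreement between inferred and measured recurrence scores as a concordance correlation coefficient. As a rough illustration only, the sketch below computes Lin's concordance correlation coefficient in plain Python. The function name, argument names, and the toy numbers are hypothetical, and the abstract does not say which estimator or software the authors used.

```python
from statistics import fmean


def concordance_correlation(measured, inferred):
    """Lin's concordance correlation coefficient for two paired sequences.

    Assumption: population (divide-by-n) variances and covariance are used;
    the abstract does not state which variant the authors applied.
    """
    if len(measured) != len(inferred) or len(measured) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(measured)
    mean_x, mean_y = fmean(measured), fmean(inferred)
    var_x = sum((x - mean_x) ** 2 for x in measured) / n
    var_y = sum((y - mean_y) ** 2 for y in inferred) / n
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(measured, inferred)) / n
    # The squared mean difference penalizes systematic offset,
    # which an ordinary Pearson correlation would ignore.
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("both sequences are constant and identical")
    return 2 * cov_xy / denom


if __name__ == "__main__":
    # Toy values, not data from the study.
    measured_scores = [12.0, 25.0, 31.0, 8.0, 19.0]
    inferred_scores = [15.0, 22.0, 27.0, 11.0, 20.0]
    print(round(concordance_correlation(measured_scores, inferred_scores), 3))
```

Unlike Pearson correlation, this coefficient falls below 1 when predictions are perfectly linear in the measured score but shifted or rescaled, which is why it suits comparing an inferred score against a measured one on the same scale.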

Background: Pre-operative risk assessment can help clinicians prepare patients for surgery, reducing the risk of perioperative complications, length of hospital stay, readmission and mortality. Further, it can facilitate collaborative decision-making and operational planning.
Objective: To develop effective pre-operative risk assessment algorithms (referred to as Patient Optimizer or POP) using Machine Learning (ML) that predict the development of post-operative complications and provide pilot data to inform the design of a larger prospective study.
Methods: After institutional ethics approval, we developed a base model that encapsulates the standard manual approach of combining patient-risk and procedure-risk. In an automated process, additional variables were included and tested with 10-fold cross-validation, and the best performing features were selected. The models were evaluated and confidence intervals calculated using bootstrapping. Clinical expertise was used to restrict the cardinality of categorical variables (e.g. pathology results) by including the most clinically relevant values. The models were created with logistic regression (LR) and extreme gradient-boosted trees using XGBoost (Chen and Guestrin, 2016). We evaluated performance using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). Data was obtained from a metropolitan university teaching hospital from January 2015 to July 2020. Data collection was restricted to adult patients undergoing elective surgery.
Results: A total of 11,475 adult admissions were included. The performance of XGBoost and LR was very similar across endpoints and metrics. For predicting the risk of any post-operative complication, kidney failure and length-of-stay (LOS), POP with XGBoost achieved an AUROC (95% CI) of 0.755 (0.744, 0.767), 0.869 (0.846, 0.891) and 0.841 (0.833, 0.847) respectively and AUPRC of 0.651 (0.632, 0.669), 0.336 (0.282, 0.390) and 0.741 (0.729, 0.753) respectively. For 30-day readmission and in-patient mortality, POP with XGBoost achieved an AUROC (95% CI) of 0.610 (0.587, 0.635) and 0.866 (0.777, 0.943) respectively and AUPRC of 0.116 (0.104, 0.132) and 0.031 (0.015, 0.072) respectively.
Conclusion: The POP algorithms effectively predicted any post-operative complication, kidney failure and LOS in the sample population. A larger study is justified to improve the algorithm to better predict complications and length of hospital stay. A larger dataset may also improve the prediction of additional specific complications, readmission and mortality.
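
The abstract above reports AUPRC values with bootstrapped 95% confidence intervals. The sketch below shows one common way to do this in plain Python: a step-wise AUPRC (average precision) combined with a percentile bootstrap. It is an illustration under assumptions, not the authors' pipeline. The function names, the number of resamples, the seed, and the toy data are all hypothetical, and the abstract does not specify the AUPRC estimator or the bootstrap variant that was used.

```python
import random


def average_precision(labels, scores):
    """Step-wise AUPRC: mean of the precision at each correctly ranked positive.

    Assumption: tied scores keep their input order (no tie correction).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank cases from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


def bootstrap_ci(labels, scores, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for metric(labels, scores)."""
    if sum(labels) == 0:
        raise ValueError("need at least one positive label")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        sample_labels = [labels[i] for i in idx]
        if sum(sample_labels) == 0:
            continue  # metric is undefined without positives; redraw
        stats.append(metric(sample_labels, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    # Toy labels and scores, not data from the study.
    y = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
    p = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.15, 0.1]
    print(average_precision(y, p), bootstrap_ci(y, p, average_precision))
```

Because the AUPRC of a random classifier is roughly the prevalence of the positive class, low absolute values such as those reported for readmission and in-patient mortality are best read against how rare each outcome is.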

Articles published on Area Under The Precision-recall Curve

Development and external validation of deep learning clinical prediction models using variable-length time series data.

Interpretable machine learning in predicting drug-induced liver injury among tuberculosis patients: model development and validation study.

Machine learning algorithms to predict colistin-induced nephrotoxicity from electronic health records in patients with multidrug-resistant Gram-negative infection

Predicting haemoglobin deferral using machine learning models: Can we use the same prediction model across countries?

Machine learning models predict the emergence of depression in Argentinean college students during periods of COVID-19 quarantine.

VEpiNet: A multimodal interictal epileptiform discharge detection method based on video and electroencephalogram data

A novel electronic health record-based, machine-learning model to predict severe hypoglycemia leading to hospitalizations in older adults with diabetes: A territory-wide cohort and modeling study.

Deep learning to predict rapid progression of Alzheimer's disease from pooled clinical trials: A retrospective study.

CAFES: Chest X-ray Analysis using Federated Self-supervised Learning for Pediatric COVID-19 Detection.

Explainable Machine Learning Model to Preoperatively Predict Postoperative Complications in Inpatients With Cancer Undergoing Major Operations

Clinical Features Predicting COVID-19 Severity Risk at the Time of Hospitalization.

A comparative evaluation of machine learning ensemble approaches for disease prediction using multiple datasets

Clinical performance of automated machine learning: A systematic review

Predicting Successful Weaning from Mechanical Ventilation by Reduction in Positive End-expiratory Pressure Level Using Machine Learning.

Feature selection strategies: a comparative analysis of SHAP-value and importance-based methods

Abstract 890: Multimodal modeling of digitized histopathology slides improves risk stratification in hormone receptor-positive breast cancer patients

Accurate prediction of pure uric acid urinary stones in clinical context via a combination of radiomics and machine learning.

Development and validation of ‘Patient Optimizer’ (POP) algorithms for predicting surgical risk with machine learning

Synthesizing class labels for highly imbalanced credit card fraud detection data

Identifying predictors of the tooth loss phenotype in a large periodontitis patient cohort using a machine learning approach
