Abstract

Background: The use of reaction history in logistic regression and machine learning (ML) models to predict penicillin allergy has been reported using non-United States (US) data.

Objective: We developed ML models to predict positive penicillin allergy testing using multi-site US data.

Methods: Retrospective data from four US-based hospitals were grouped into four datasets: enriched training (1:3 case-control matched cohort), enriched testing, non-enriched internal testing, and non-enriched external testing. ML algorithms were used for model development. We determined the area under the curve (AUC) and applied the Shapley Additive exPlanations (SHAP) framework to interpret risk drivers.

Results: Of 4,777 patients (mean age 60 [SD 17], 68% women, 91% White, 86% non-Hispanic) evaluated for penicillin allergy labels, 513 (11%) had positive penicillin allergy testing. Model input variables were frequently missing: immediate or delayed onset (71%), signs or symptoms (13%), and treatment (31%). The gradient boosted model was the strongest model, with an AUC of 0.67 (95% CI 0.57-0.77), which improved to 0.87 (95% CI 0.73-1) when only cases with complete data were used. Top SHAP drivers for positive testing were reactions within the last year and reactions requiring medical attention; female sex and a reaction of hives/urticaria were also positive drivers.

Conclusion: An ML prediction model for positive penicillin allergy skin testing using US-based retrospective data did not achieve performance strong enough for acceptance and adoption. The optimal ML prediction model for positive penicillin allergy testing was driven by time since reaction, seeking medical attention, female sex, and hives/urticaria.
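
To make the reported metric concrete, the following is a minimal sketch of how an AUC and an accompanying 95% confidence interval can be computed from true labels and predicted scores. The abstract does not state how its intervals were obtained; the percentile bootstrap used here is an assumption chosen for illustration, the function names `auc` and `bootstrap_auc_ci` are hypothetical, and the demo data are synthetic rather than study data.

```python
import random


def auc(labels, scores):
    """AUC via the Mann-Whitney rank-sum formula; tied scores share an average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(label for _, label in pairs)  # labels must be 0 or 1
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point AUC plus a percentile-bootstrap interval (an assumed method, not the paper's)."""
    point = auc(labels, scores)  # also validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper


if __name__ == "__main__":
    # Synthetic example only: roughly 11% positives with modestly separated scores.
    rng = random.Random(42)
    labels = [1 if rng.random() < 0.11 else 0 for _ in range(400)]
    scores = [rng.gauss(0.6 if y else 0.0, 1.0) for y in labels]
    point, lower, upper = bootstrap_auc_ci(labels, scores)
    print(f"AUC {point:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

With few positive cases in a test set, intervals from a procedure like this tend to be wide, which is consistent with the broad intervals reported above.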
