Introduction
Adverse drug events (ADEs) pose a significant challenge in current clinical practice. Machine learning (ML) has been increasingly used to predict specific ADEs using electronic health record (EHR) data. This systematic review provides a comprehensive overview of the application of ML in predicting specific ADEs based on EHR data.

Methods
A systematic search of PubMed, Web of Science, Embase, and IEEE Xplore was conducted to identify relevant articles published from database inception to 20 May 2024. Studies that used EHR data to develop ML models for predicting specific ADEs, or ADEs associated with particular drugs, were included.

Results
A total of 59 studies met the inclusion criteria, covering 15 drugs and 15 ADEs. In total, 38 ML algorithms were reported, with random forest (RF) being the most frequently used, followed by support vector machine (SVM), eXtreme gradient boosting (XGBoost), decision tree (DT), and light gradient boosting machine (LightGBM). The performance of the ML models was generally strong, with an average area under the curve (AUC) of 76.68% ± 10.73, accuracy of 76.00% ± 11.26, precision of 60.13% ± 24.81, sensitivity of 62.35% ± 20.19, specificity of 75.13% ± 16.60, and an F1 score of 52.60% ± 21.10. Under a random effects model, the pooled sensitivity, specificity, and diagnostic odds ratio (DOR) were 0.65 (95% CI: 0.65–0.66), 0.89 (95% CI: 0.89–0.90), and 12.11 (95% CI: 8.17–17.95), respectively, and the AUC of the summary receiver operating characteristic (SROC) curve was 0.8069. The risk factors associated with different drugs and ADEs varied.

Discussion
Future research should focus on improving standardization, conducting multicenter studies that incorporate diverse data types, and evaluating the impact of artificial intelligence predictive models in real-world clinical settings.

Systematic Review Registration
https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42024565842, identifier CRD42024565842.