Aims: Few personalized monitoring models for valproic acid (VPA) in pediatric epilepsy patients (PEPs) incorporate machine learning (ML) algorithms. This study aimed to develop an ensemble ML model for VPA monitoring to improve the clinical precision of VPA use.

Methods: A dataset of 366 VPA trough concentrations from 252 PEPs, with 19 covariates and the target variable (VPA trough concentration), was refined by Spearman correlation and multicollinearity testing, yielding a 366 × 11 dataset. The dataset was split into a training set (n = 292) and a testing set (n = 74) at a ratio of 8:2. An ensemble model was built from Gradient Boosting Regression Trees (GBRT), Random Forest Regression (RFR), and Support Vector Regression (SVR), and covariate importance was assessed by SHapley Additive exPlanations (SHAP) analysis. The model was optimized for R², relative accuracy, and absolute accuracy, and validated against two independent external datasets (in-hospital, n = 32; out-of-hospital, n = 28).

Results: With the R² weight ratio of GBRT, RFR, and SVR optimized at 5:2:3, the ensemble model demonstrated superior performance in relative accuracy (87.8%), absolute accuracy (78.4%), and R² (0.50), together with a lower Mean Absolute Error (9.87) and Root Mean Squared Error (12.24), as validated by the external datasets. Platelet count (PLT) and VPA daily dose were identified as pivotal covariates.

Conclusion: The proposed ensemble model effectively monitors VPA trough concentrations in PEPs. By integrating covariates across several ML algorithms, it delivers results closely aligned with clinical practice, offering substantial clinical value for the guided use of VPA.
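The 5:2:3 weighting and the error metrics reported above can be illustrated with a minimal sketch. It assumes the ensemble is a weighted average of the three base models' predictions, which the abstract does not state explicitly. The names `WEIGHTS`, `ensemble_predict`, `mae`, `rmse`, and `r2` are hypothetical, and the metrics use their standard textbook definitions. Relative and absolute accuracy are omitted because the abstract does not define them.

```python
import math

# Weight ratio GBRT:RFR:SVR of 5:2:3 from the abstract, normalized to sum to 1.
# Treating the ratio as weights in a weighted average is an assumption.
WEIGHTS = {"gbrt": 0.5, "rfr": 0.2, "svr": 0.3}


def ensemble_predict(preds):
    """Weighted average of per-model predictions.

    preds maps each model name in WEIGHTS to a list of predicted
    trough concentrations, all of the same length.
    """
    n = len(preds["gbrt"])
    return [
        sum(WEIGHTS[name] * preds[name][i] for name in WEIGHTS)
        for i in range(n)
    ]


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

For example, `ensemble_predict({"gbrt": [60.0], "rfr": [70.0], "svr": [50.0]})` combines the three predictions as 0.5 × 60 + 0.2 × 70 + 0.3 × 50. The input values here are made up for illustration and are not data from the study.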