Asthma is the most common multifactorial chronic disease among children. Identifying children at high risk of severe asthma outcomes, such as emergency department (ED) visits and hospitalizations due to asthma exacerbation, is essential in asthma care and clinical management. Existing studies have employed various machine learning methods to predict pediatric asthma occurrence or progression using electronic health record (EHR) data. However, these studies have often neglected the correlated nature of EHR data (e.g., repeated clinic visits by the same patient). To address this issue, this research applied and evaluated two types of machine learning-based methods for longitudinal or clustered data: random forests with mixed effects and generalized neural networks with mixed effects. We applied these methods to a large real-world asthma EHR dataset obtained from the Children’s Hospital of Pittsburgh over a four-year period spanning the pre- to post-COVID-19 pandemic era, focusing on predicting the probability of an ED visit due to asthma exacerbation and the length of stay (LOS) when hospitalized. Moreover, we characterized the importance of predictors using the kernel SHAP metric and identified vulnerable patient groups that are more likely to experience asthma exacerbation or have a longer LOS. Our findings provide valuable guidance for improving pediatric asthma care by prioritizing the protection of these vulnerable patients, especially when a disruptive health crisis occurs. Journal of Statistical Research 2024, Vol. 58, No. 1, pp. 131-149.