This study develops machine learning (ML) models to predict hypoxemia severity during emergency triage, particularly in Chemical, Biological, Radiological, Nuclear, and Explosive (CBRNE) scenarios, using physiological data from medical-grade sensors. Tree-based models (TBMs) such as XGBoost, LightGBM, CatBoost, Random Forests (RFs), Voting Classifier ensembles, and sequential models (LSTM, GRU) were trained on the MIMIC-III and IV datasets. A preprocessing pipeline addressed missing data, class imbalances, and synthetic data flagged with masks. Models were evaluated using a 5-min prediction window with minute-level interpolations for timely interventions. TBMs outperformed sequential models in speed, interpretability, and reliability, making them better suited for real-time decision-making. Feature importance analysis identified six key physiological variables from the enhanced NEWS2+ score and emphasized the value of mask and score features for transparency. Voting Classifier ensembles showed slight metric gains but did not outperform individually optimized models, facing a precision-sensitivity tradeoff and slightly lower F1-scores for key severity levels. TBMs were effective for real-time hypoxemia prediction, while sequential models, though better at temporal handling, were computationally costly. This study highlights ML's potential to improve triage systems and reduce alarm fatigue, with future plans to incorporate multi-hospital datasets for broader applicability.
Read full abstract