Non-Fatal Drowning Risk Prediction Based on Stacking Ensemble Algorithm.

Xinshan Xie, Lihua Yin, Wei Wu, Dandan Peng, Zhixing Li, Qingsong Chen, Haofeng Xu, Wenjun Ma, Ruilin Meng

doi:10.3390/children9091383

Abstract

Drowning is a major public health problem and a leading cause of death among children in developing countries. We sought machine learning (ML) algorithms that could offer new insight into risk assessment for non-fatal drowning. Data on non-fatal drowning were collected in Qingyuan city, Guangdong Province, China. We developed four ML models to predict non-fatal drowning risk: a logistic regression (LR) model, a random forest (RF) model, a support vector machine (SVM) model, and a stacking-based model built on these three primary learners (LR, RF, and SVM). The area under the curve (AUC), F1 value, accuracy, sensitivity, and specificity were calculated to evaluate the predictive ability of each learning algorithm. The study included a total of 8390 children, of whom 1013 (12.07%) had experienced non-fatal drowning. The following factors were closely associated with the risk of non-fatal drowning: frequency of swimming in open water, distance between the school and surrounding open waters, swimming skills, personality (introvert), and relationship with family members. Compared with the three base models, the stacking generalization model achieved superior performance on the non-fatal drowning dataset (AUC = 0.741, sensitivity = 0.625, F1 value = 0.359, accuracy = 0.739, and specificity = 0.754). This study indicates that stacking ensemble algorithms may outperform other ML models on non-fatal drowning data.
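To make the threshold-based evaluation metrics named above concrete, the following minimal Python sketch computes sensitivity, specificity, accuracy, and F1 value from the four cells of a binary confusion matrix. The function name `classification_metrics` and the example counts are assumptions chosen for illustration; they are not the study's code or data. AUC is omitted because it requires predicted scores rather than counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives, and false negatives (hypothetical inputs here).
    """
    sensitivity = tp / (tp + fn)          # recall: share of true cases detected
    specificity = tn / (tn + fp)          # share of non-cases correctly ruled out
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)            # share of predicted cases that are real
    # F1 is the harmonic mean of precision and sensitivity
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
    }


# Illustrative counts only (not taken from the study): 80 positives, 520 negatives
metrics = classification_metrics(tp=50, fp=120, tn=400, fn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With a minority positive class, as in this dataset, a low precision can keep the F1 value well below the accuracy even when sensitivity and specificity are both moderate, which is the pattern seen in the reported figures. As a rough consistency check, and assuming the evaluation set shares the overall prevalence of 12.07%, the reported sensitivity and specificity imply an accuracy of about 0.1207 × 0.625 + 0.8793 × 0.754 ≈ 0.738, close to the reported 0.739.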

Full Text