Background: Most older people do not live long, which leaves them less time to pursue self-actualization and contribute to society. Although longevity in the elderly has been widely studied, traditional statistical methods are limited in their ability to examine the important influencing factors jointly and to build a simple, effective prediction model.

Methods: Based on data from the Chinese Longitudinal Healthy Longevity Survey (CLHLS), the 2008–2018 and 2014–2018 cohorts were selected, and 16 features were filtered and integrated. Five machine learning algorithms, Elastic-Net Regression (ENR), Decision Tree (DT), Random Forest (RF), K-Nearest Neighbor (KNN), and eXtreme Gradient Boosting (XGBoost), were used to develop models, which were assessed by internal validation with the CLHLS 2008–2018 cohort and temporal validation with the CLHLS 2014–2018 cohort. The best-performing model was then explained, and simpler models were developed according to the variable importance results.

Results: The model developed with the XGBoost algorithm performed best, with an AUC of 0.788 in internal validation and 0.806 in temporal validation. Eight features were more important in the model: instrumental activity of daily living (IADL), leisure activity, marital status, sex, activity of daily living (ADL), cognitive function, overall plant-based diet index (PDI), and psychological resilience. Simpler models developed with these 8 features showed no decrease in performance in either internal or temporal validation.

Conclusions: The study indicated the importance of these 8 factors for predicting death among elderly people in China and built a simple machine learning model with good predictive performance. It can inspire key directions for future research on promoting longevity in the elderly, and in practice the predictive model can aid decision-making, whether to support healthy longevity or to provide timely end-of-life care.
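
The AUC figures reported for internal and temporal validation can be illustrated with a minimal sketch. This is not the authors' code: the `auc` helper, the 0/1 outcome coding, and the toy labels and scores are all made up for illustration. The sketch assumes only that a fixed model produces one risk score per person and that the same metric is computed separately on each validation cohort.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 0/1 outcomes (1 = died during follow-up; hypothetical coding)
    scores: predicted risk per person, higher = more likely to die
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based average ranks so that tied scores count as half a win.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")

    # U statistic: rank sum of the positives minus its smallest possible value.
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Toy, made-up data: two small "cohorts" scored by the same fixed model.
internal_labels = [0, 0, 1, 1, 0, 1]
internal_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
temporal_labels = [0, 1, 0, 1]
temporal_scores = [0.3, 0.45, 0.5, 0.9]

internal_auc = auc(internal_labels, internal_scores)
temporal_auc = auc(temporal_labels, temporal_scores)
print(f"internal AUC = {internal_auc:.3f}, temporal AUC = {temporal_auc:.3f}")
```

An AUC of 0.5 corresponds to ranking at chance level and 1.0 to perfect ranking, so values such as 0.788 and 0.806 mean that a randomly chosen person who died was scored as higher risk than a randomly chosen survivor in roughly four out of five pairs.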