Predicting risk of overweight or obesity in Chinese preschool-aged children using artificial intelligence techniques.

Qiong Wang, Min Yang, Mei Xue, Wenquan Niu, Bo Pang, Yicheng Zhang, Zhixin Zhang

doi:10.1007/s12020-022-03072-1

Abstract

We applied machine-learning algorithms and a deep-learning sequential model to identify and optimize the most important factors for overweight and obesity in Chinese preschool-aged children. This cross-sectional survey was conducted in 2020 in Beijing and Tangshan. Children aged 3-6 years were enrolled using a stratified cluster random sampling strategy. Data were analyzed in Python using PyCharm. A total of 9478 children were eligible for inclusion, including 1250 children with overweight or obesity. All children were randomly divided into a training group and a testing group at a 6:4 ratio. By accuracy, support vector machine (SVM) outperformed the other algorithms (accuracy: 0.9457), followed closely by gradient boosting machine (GBM) (accuracy: 0.9454). Across the other 4 performance indexes, GBM had the highest F1 score (0.7748), followed by SVM (F1 score: 0.7731). After importance ranking, the top 5 factors appeared sufficient to obtain decent performance under the GBM algorithm: age, eating speed, number of relatives with obesity, consumption of sweet drinks, and paternal education. The performance of the top 5 factors was reinforced by the deep-learning sequential model. In summary, we identified 5 important factors that can be fed to the GBM algorithm to differentiate children with overweight or obesity from the general population of children, with decent prediction performance.
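Because only 1250 of the 9478 children had overweight or obesity (about 13.2%), accuracy and F1 score can tell different stories, which may be why both are reported. The following is a minimal sketch of that arithmetic, not the authors' code: it uses the whole-cohort counts from the abstract to compute the accuracy of a trivial "always predict normal weight" classifier, and a purely hypothetical confusion matrix (the `example` counts are invented for illustration) to show how the two metrics are derived. The reported metrics themselves were presumably computed on the testing group, whose exact composition is not given here.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Counts taken from the abstract (whole cohort, not the testing group).
N_CHILDREN = 9478
N_CASES = 1250  # children with overweight or obesity

# Trivial baseline: predict "no overweight/obesity" for every child.
baseline = classification_metrics(tp=0, fp=0, fn=N_CASES, tn=N_CHILDREN - N_CASES)

# Hypothetical confusion matrix, invented purely to illustrate the formulas.
example = classification_metrics(tp=80, fp=10, fn=20, tn=890)

print(f"baseline accuracy: {baseline['accuracy']:.4f}, F1: {baseline['f1']:.4f}")
print(f"example accuracy:  {example['accuracy']:.4f}, F1: {example['f1']:.4f}")
```

Under these assumptions the trivial baseline already reaches an accuracy of roughly 0.868 while its F1 score is 0, so the near-identical SVM and GBM accuracies are best read alongside their F1 scores.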

Full Text