Abstract

Machine learning (ML) algorithms may help better understand the complex interactions among factors that influence dietary choices and behaviors. The aim of this study was to explore whether ML algorithms are more accurate than traditional statistical models in predicting vegetable and fruit (VF) consumption. A large array of features (2,452 features from 525 variables) encompassing individual and environmental information related to dietary habits and food choices in a sample of 1,147 French-speaking adult men and women was used for the purpose of this study. Adequate VF consumption, defined as 5 servings/d or more, was measured by averaging data from three web-based 24 h recalls and used as the outcome to predict. Nine classification ML algorithms were compared to two traditional statistical predictive models, logistic regression and penalized regression (Lasso). The performance of the predictive ML algorithms was tested after the implementation of adjustments, including normalizing the data, as well as in a series of sensitivity analyses such as using VF consumption obtained from a web-based food frequency questionnaire (wFFQ) and applying a feature selection algorithm in an attempt to reduce overfitting. Logistic regression and Lasso predicted adequate VF consumption with an accuracy of 0.64 (95% confidence interval [CI]: 0.58–0.70) and 0.64 (95% CI: 0.60–0.68), respectively. Among the ML algorithms tested, the most accurate in predicting adequate VF consumption were the support vector machine (SVM) with either a radial basis kernel or a sigmoid kernel, both with an accuracy of 0.65 (95% CI: 0.59–0.71). The least accurate ML algorithm was the SVM with a linear kernel, with an accuracy of 0.55 (95% CI: 0.49–0.61). Using dietary intake data from the wFFQ and applying a feature selection algorithm had little to no impact on the performance of the algorithms.
In summary, ML algorithms and traditional statistical models predicted adequate VF consumption with similar accuracies among adults. These results suggest that additional research is needed to further explore the true potential of ML in predicting dietary behaviors that are determined by complex interactions among several individual, social and environmental factors.
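
To make the outcome definition and the reported intervals concrete, the following is a minimal sketch, not the authors' code. The first function applies the abstract's definition of adequate VF consumption (a mean of 5 servings/d or more across the 24 h recalls). The second computes a normal-approximation (Wald) 95% CI for an accuracy. The abstract does not say how its confidence intervals were obtained, so the Wald formula, the evaluation-set size of 230, the recall values, and the function names are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean

ADEQUATE_THRESHOLD = 5.0  # servings/d, the cut-off stated in the abstract


def adequate_vf(recall_servings):
    """Return True if mean VF servings/d across the recalls meets the cut-off."""
    return mean(recall_servings) >= ADEQUATE_THRESHOLD


def wald_ci(accuracy, n, z=1.96):
    """Normal-approximation CI for an accuracy estimated on n cases.

    Assumption: the study may have used a different interval method.
    """
    half_width = z * sqrt(accuracy * (1.0 - accuracy) / n)
    return max(0.0, accuracy - half_width), min(1.0, accuracy + half_width)


# Three hypothetical 24 h recalls for one participant (servings/d).
label = adequate_vf([4.0, 6.5, 5.5])

# n = 230 is a hypothetical evaluation-set size, not a figure from the study.
low, high = wald_ci(0.64, 230)
print(label, round(low, 2), round(high, 2))
```

With these illustrative inputs the interval is about 0.06 wide on each side of the point estimate, which shows why accuracies of 0.64 and 0.65 cannot be told apart at this scale.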

Highlights

  • Artificial intelligence (AI) has become prominent in healthcare research and precision medicine for assessing disease risk, identifying potential complications or selecting treatments [1–3]

  • The rapid and successful progress in precision medicine based on machine learning (ML) suggests promising applications in other fields including public health nutrition, where large amounts of data are already available [13] yet remain largely unexploited

  • The hypothesis that ML classification algorithms outperform traditional statistical classification models when predicting adequate vegetable and fruit (VF) consumption based on a wide spectrum of individual, social and environmental data was not supported by our experimental data


Introduction

Artificial intelligence (AI) has become prominent in healthcare research and precision medicine for assessing disease risk, identifying potential complications or selecting treatments [1–3]. The rapid and successful progress in precision medicine based on ML suggests promising applications in other fields including public health nutrition, where large amounts of data are already available [13] yet remain largely unexploited. ML algorithms may help achieve a more comprehensive understanding of factors that are associated with, influence or determine the quality of the diet at the individual or population level. This is an important area to explore because low-quality diets are responsible for half of the deaths associated with chronic diseases globally, which is more than any other risk factor, including smoking [14]. Despite several public health efforts and policies, adhering to healthy eating remains a challenge

