Abstract

We aimed to investigate high-risk sociodemographic and environmental determinants of health (SEDH) potentially associated with adult obesity in United States counties using machine-learning techniques. We performed a cross-sectional analysis of county-level adult obesity prevalence (body mass index ≥30 kg/m²) in the United States using data from the Diabetes Surveillance System 2017. We collected 49 county-level SEDH factors and used them in a classification and regression trees (CART) model to identify county-level clusters. The CART model was validated using a 'hold-out' set of counties, and variable importance was evaluated using Random Forest. Overall, we analysed 2752 counties in the United States, with a national median (interquartile range) obesity prevalence of 34.1% (30.2%, 37.7%). The CART method identified 11 clusters, with a 60.8% relative increase in prevalence across the spectrum of clusters. CART selected seven key SEDH variables to define the clusters: Physically Inactive (%), Diabetes (%), Severe Housing Problems (%), Food Insecurity (%), Uninsured (%), Population over 65 years (%) and Non-Hispanic Black (%). There is significant county-level geographical variation in obesity prevalence in the United States, which can in part be explained by complex SEDH factors. Analysing these factors with machine-learning techniques can provide valuable insights into the importance of these upstream determinants of obesity and can therefore aid the development of geo-specific strategic interventions and optimize resource allocation to help combat the obesity pandemic.
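
The workflow described above (a CART model whose leaves act as county clusters, validation on a hold-out set, and Random Forest variable importance) can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn, uses synthetic data in place of the county-level dataset, and the column names, outcome formula and hyperparameters (`max_leaf_nodes=11`, `min_samples_leaf=50`, a 70/30 split) are hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for county-level data (NOT the study's data).
rng = np.random.default_rng(0)
n_counties = 2752
feature_names = [  # hypothetical column names
    "physically_inactive_pct",
    "diabetes_pct",
    "food_insecurity_pct",
    "uninsured_pct",
]
X = rng.uniform(5, 40, size=(n_counties, len(feature_names)))
# Hypothetical outcome: obesity prevalence driven by the first two features.
y = 20 + 0.4 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(0, 2, n_counties)

# Hold out 30% of counties for validation (split size is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# CART regression tree; capping the leaves at 11 mirrors the 11 clusters
# reported in the abstract, but is an illustrative setting only.
cart = DecisionTreeRegressor(
    max_leaf_nodes=11, min_samples_leaf=50, random_state=0
)
cart.fit(X_train, y_train)

# Each leaf of the fitted tree is treated as one cluster of counties.
train_leaf = cart.apply(X_train)
cluster_ids = np.unique(train_leaf)
cluster_means = np.array(
    [y_train[train_leaf == c].mean() for c in cluster_ids]
)

# Relative increase (%) from the lowest to the highest cluster mean.
relative_increase = (
    (cluster_means.max() - cluster_means.min()) / cluster_means.min() * 100
)

# Validate the tree on the held-out counties (R^2 on unseen data).
holdout_r2 = cart.score(X_test, y_test)

# Random Forest impurity-based variable importance (sums to 1).
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
importance = dict(zip(feature_names, forest.feature_importances_))

print(f"clusters: {len(cluster_ids)}, hold-out R^2: {holdout_r2:.2f}")
print(f"relative increase across clusters: {relative_increase:.1f}%")
for name, value in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

In this sketch the splitting variables of the tree play the role of the key SEDH variables, and the relative increase is computed between the lowest and highest cluster means, which is one plausible reading of "across the spectrum".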
