Abstract

BACKGROUND AND AIM: Type 2 diabetes (T2D) is a chronic disease with a high individual and societal burden. Risk of T2D is partly influenced by urban characteristics, either directly (e.g. air pollution) or indirectly through their relation with lifestyle behaviours (e.g. walkability, food environment). Environmental characteristics are generally studied one at a time in their association with T2D, but in real life they occur simultaneously and may be associated with T2D in nonlinear and non-additive ways. We aimed to identify which factors of the urban exposome are associated with T2D by applying artificial neural networks (ANN). In addition, we compared the results of the ANN model with those of LASSO penalized regression, a more conventional method.

METHODS: We analyzed baseline data from 14,829 participants of the Occupational and Environmental Health Cohort study living across the Netherlands. Self-reported questionnaire data were used to identify participants diagnosed with T2D (n=676, 4.6%). Exposome variables (n=86) were linked to individual home addresses and included air pollution, traffic noise, green space, chemicals in drinking water, built-environment characteristics and neighborhood socio-demographic characteristics. Models were adjusted for individual socio-demographic variables. Nested cross-validation was used to determine the optimal model parameters of both approaches (ANN and LASSO), and the cross-validated predictive accuracies were compared.

RESULTS: One exposure was selected by each approach. Living in neighborhoods with a higher share of non-Western immigrants (selected by ANN) was associated with a higher risk of T2D. A higher average home value in the residential neighborhood (selected by LASSO) was associated with a lower risk of T2D. Cross-validated prediction error, measured as log loss (SD), was 0.177 (0.0060) for ANN and 0.167 (0.0022) for LASSO.

CONCLUSIONS: Neighborhood socio-demographic characteristics are associated with the risk of T2D. Predictive accuracy of the ANN model was lower than that of LASSO, which might be due to the low prevalence of diabetes and the relatively weak signal in the data.

KEYWORDS: Multiple exposures, Machine learning, Neighborhood socio-demographic characteristics
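
The model comparison described in METHODS can be illustrated with a minimal sketch of nested cross-validation scored by log loss. This is not the authors' code: the data are synthetic (generated with scikit-learn's make_classification), and the fold counts, hyperparameter grids, network size and the use of L1-penalized logistic regression as the LASSO model are all assumptions chosen for illustration. The inner loop tunes each model, and the outer loop estimates its out-of-sample log loss.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cohort (assumption): imbalanced binary outcome.
X, y = make_classification(n_samples=600, n_features=10, n_informative=3,
                           weights=[0.9, 0.1], random_state=0)

# Inner folds tune hyperparameters; outer folds estimate prediction error.
inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=1)
outer = StratifiedKFold(n_splits=3, shuffle=True, random_state=2)

# L1-penalized logistic regression as the LASSO-type model (assumed grid).
lasso = GridSearchCV(
    make_pipeline(StandardScaler(),
                  LogisticRegression(penalty="l1", solver="liblinear")),
    param_grid={"logisticregression__C": [0.01, 0.1, 1.0]},
    scoring="neg_log_loss", cv=inner)

# Small feed-forward network as the ANN (assumed architecture and grid).
ann = GridSearchCV(
    make_pipeline(StandardScaler(),
                  MLPClassifier(hidden_layer_sizes=(8,), max_iter=500,
                                random_state=0)),
    param_grid={"mlpclassifier__alpha": [1e-3, 1e-1]},
    scoring="neg_log_loss", cv=inner)


def nested_log_loss(model):
    """Return mean and SD of the outer-fold log loss for a tuned model."""
    scores = -cross_val_score(model, X, y, scoring="neg_log_loss", cv=outer)
    return scores.mean(), scores.std()


results = {name: nested_log_loss(m) for name, m in [("LASSO", lasso), ("ANN", ann)]}
for name, (mean, sd) in results.items():
    print(f"{name}: logLoss {mean:.3f} (sd {sd:.3f})")
```

Lower log loss means better-calibrated predicted probabilities. Because tuning happens only inside each outer training fold, the outer-fold scores are not biased by the hyperparameter search.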
