Neighborhood greenspace benefits health, yet few tools are available for estimating the health consequences of alternative community greenspace designs, especially before a landscape design plan is implemented. Here we present a machine-learning-based tool that predicts the prevalence of non-communicable diseases from landscape design maps at the community scale. The tool was developed using data collected in five major metropolises in the United States. Using high-resolution satellite imagery and remote sensing technologies, we extracted landscape spatial design characteristics through spatial pattern analysis, which allowed us to measure the verdancy, fragmentation, connectedness, aggregation, and shape of greenspaces. We built a model that estimates the prevalence of poor mental health, coronary heart disease, stroke, diabetes, chronic obstructive pulmonary disease (COPD), and physical inactivity at the census tract level by combining a random forest algorithm with spatial Gaussian process models. The model's accuracy was significantly higher than that of ordinary regression models. The model accounted for very high levels of variance in the specified morbidities; viz., poor mental health (97%), coronary heart disease (93%), stroke (93%), diabetes (95%), COPD (94%), and physical inactivity (98%). The tool is implemented in a freely available programming language (R), and we make the model publicly accessible. It enables urban planners and landscape designers to assess and compare the health effects of different greenspace design plans before implementation, thus providing policymakers and designers with evidence-rich alternatives during the health-promoting greenspace planning process.