Abstract

This paper provides evidence on the usefulness of very high spatial resolution (VHR) imagery for gathering socioeconomic information in urban settlements. We use land cover, spectral, structure and texture features extracted from a Google Earth image of Liverpool (UK) and evaluate their potential to predict Living Environment Deprivation at the small statistical area level. We also contribute to the methodological literature on the estimation of socioeconomic indices with remote-sensing data by introducing elements from modern machine learning. In addition to classical approaches such as Ordinary Least Squares (OLS) regression and a spatial lag model, we explore the potential of the Gradient Boost Regressor and Random Forests to improve predictive performance and accuracy. Alongside these novel prediction methods, we introduce tools for model interpretation and evaluation, such as feature importance, partial dependence plots and cross-validation. Random Forest is the best-performing model, with an R² of around 0.54, followed by the Gradient Boost Regressor with 0.5. The spatial lag model and OLS fall behind, with considerably lower R² values of 0.43 and 0.3, respectively.
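To make the evaluation idea concrete, the following is a minimal sketch of k-fold cross-validated R² for a simple OLS fit, written in plain Python. It is not the authors' code: the data are synthetic, the model has a single predictor rather than the image-derived feature set, and the helper names (`r2_score`, `fit_ols`, `cross_val_r2`) are hypothetical. The same out-of-sample scoring logic would apply to any of the compared models.

```python
import random
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_ols(x, y):
    """Closed-form OLS with one predictor; returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def cross_val_r2(x, y, k=5, seed=0):
    """Mean out-of-sample R^2 over k shuffled folds."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint held-out sets
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Fit on the training folds only, then score on the held-out fold.
        a, b = fit_ols([x[i] for i in train], [y[i] for i in train])
        preds = [a + b * x[i] for i in held_out]
        scores.append(r2_score([y[i] for i in held_out], preds))
    return mean(scores)


# Synthetic stand-in data (an assumption; not the Liverpool dataset).
rng = random.Random(42)
x = [rng.uniform(0, 1) for _ in range(200)]
y = [2.0 * xi + rng.gauss(0, 0.5) for xi in x]
cv_r2 = cross_val_r2(x, y)
print(f"5-fold cross-validated R^2: {cv_r2:.2f}")
```

Scoring each model on folds it was not fitted to is what makes R² values comparable across approaches of different flexibility; an in-sample R² would tend to favour the more flexible model regardless of its predictive value.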

Highlights

  • The use of remote sensing data to gather socioeconomic information is based on the premise that the physical appearance of a human settlement is a reflection of the society that created it and on the assumption that people living in urban areas with similar physical housing conditions have similar social and demographic characteristics [1, 2]

  • We compare the results with the outcomes of two classic econometric models: Ordinary Least Squares (OLS) regression and a Spatial Lag (SL) model based on the generalized method of moments

  • We describe the main results along two lines: model interpretation, which covers the output of each estimated model; and model performance, which assesses in detail the relative advantages of each approach in predicting the Living Environment Deprivation (LED) index


Summary

Introduction

The use of remote sensing data to gather socioeconomic information rests on the premise that the physical appearance of a human settlement reflects the society that created it, and on the assumption that people living in urban areas with similar physical housing conditions have similar social and demographic characteristics [1, 2]. The number of studies that address the usefulness of remote sensing imagery for estimating socioeconomic variables has increased in recent years [3,4,5,6,7,8,9]. This trend is related to the increasing availability of commercial satellite platforms and the decreasing cost of this kind of data [10, 11]. Remote sensing imagery could serve as an alternative source of information in urban settings when survey data are scarce, or to update socioeconomic data for dates other than those of surveys or censuses [7].

