Abstract

Estimating per-capita income and household income at a fine-grained geographical scale is critical but challenging, even in developed economies. In this article, we develop a novel Siamese-like Convolutional Neural Network, integrated with Ridge Regression and Gaussian Process Regression, for fine-grained estimation of income across different parts of New York City. Our model (the GP-Mixed-Siamese-like-Double-Ridge model) makes use of the pairwise comparison of location-based house price information, daytime satellite images, street views and spatial location information as inputs. Taking the per-capita income and the median household income in New York City as the ground truths, our model outperforms other state-of-the-art income estimation models (R² = 0.72-0.86 for five-fold validation) and achieves good performance in cross-district and cross-scale validation. We also find that models which partially share our model architecture, including the Spatial-Information-GP and the Mixed-Siamese-like model, perform well under certain spatial granularities and data availability conditions. Since such models rely on fewer data input types and simpler architectures, they can be used to save resources on data collection and model training. Hence, using our model for fine-grained income estimation does not mean excluding these models that share similar architectures. Our fine-grained income estimation model allows per-capita and household income data generated at a fine-grained resolution to be coupled with other types of data at the same scale, such as air pollution or epidemic data, so that location-specific socio-economic studies and evidence-based decision-making can be conducted at that resolution. Future research will focus on extending our model to fine-grained income estimation in developing metropolises, and to the estimation of other socio-economic indicators.
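As a rough illustration of how a five-fold validation score such as the R² reported above can be computed, the sketch below fits a closed-form ridge regression on synthetic features and reports out-of-fold R², RMSE and MAE. The synthetic data, the helper names `ridge_fit` and `five_fold_scores`, and the regularisation strength `alpha` are assumptions made for illustration. This is not the GP-Mixed-Siamese-like-Double-Ridge model itself, which also involves CNN-derived features and Gaussian Process Regression.

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalised intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the weight vector w.
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b

def five_fold_scores(X, y, alpha=1.0, n_splits=5, seed=0):
    """Pooled out-of-fold R^2, RMSE and MAE from k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_splits)
    preds = np.empty(len(y), dtype=float)
    for k in range(n_splits):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_splits) if j != k])
        w, b = ridge_fit(X[train], y[train], alpha)
        preds[test] = X[test] @ w + b
    resid = y - preds
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    rmse = float(np.sqrt(np.mean(resid ** 2)))
    mae = float(np.mean(np.abs(resid)))
    return {"r2": float(r2), "rmse": rmse, "mae": mae}

# Synthetic stand-in for district-level features and income (hypothetical data).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 8))
true_w = rng.normal(size=8)
y = X @ true_w + 0.1 * rng.normal(size=200)

scores = five_fold_scores(X, y)
print(scores)
```

Here the scores are pooled over all out-of-fold predictions; averaging per-fold scores is an equally common convention and can give slightly different numbers.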

Highlights

  • Measuring income¹ distribution at a high spatial resolution is critical but challenging, even for developed economies [1]–[3]

    ¹ According to the definition of the American Community Survey, "Total income" refers to the sum of incomes reported separately for wage or salary income; net self-employment income; interest, dividends, or net rental or royalty income, or income from estates and trusts; Social Security or Railroad Retirement Income; Supplemental Security Income (SSI); public assistance or welfare payments; retirement, survivor, or disability pensions; and all other incomes [3].

    The associate editor coordinating the review of this manuscript and approving it for publication was Jinjia Zhou.

  • Given this background, we propose the adoption of a transfer learning methodology for fine-grained per-capita income and median household income estimation in developed economies, which outperforms state-of-the-art models and achieves a higher estimation accuracy at the district level of a city

  • In Part 1, we develop a Siamese-like Convolutional Neural Network (CNN) to extract house price-related features from the daytime satellite images and the street views collected from New York City (NYC)


Summary

INTRODUCTION

Measuring income distribution at a high spatial resolution is critical but challenging, even for developed economies. For district-level income estimation, previous studies have not yet combined daytime satellite images with street views as model inputs. Given this background, we propose the adoption of a transfer learning methodology for fine-grained per-capita income and median household income estimation in developed economies, which outperforms state-of-the-art models and achieves a higher estimation accuracy at the district level of a city. Our proposed method combines four data categories as inputs: house price, daytime satellite image, street view, and spatial information (latitude and longitude of the district centroid). In Part 1, we develop a novel Siamese-like CNN to extract house price-related features from the daytime satellite images and the street views collected from NYC, for use in fine-grained income estimation.

Comparison of the Mixed Siamese GP Model with Models of Alternative Data Inputs and Architectures (Five-fold Validation R², RMSE, MAE)
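To make the idea of pairwise house price comparison more concrete, the sketch below builds training pairs of locations and labels each pair by which location has the higher house price. In a Siamese-like setup, the two images of a pair would pass through the same encoder, and a label of this kind would supervise the comparison. The pairing rule, the `min_gap` threshold and the function name `make_price_pairs` are hypothetical; the pairing scheme actually used in the paper is not described in this summary.

```python
from itertools import combinations

def make_price_pairs(prices, min_gap=0.0):
    """Build (i, j, label) triples for pairwise price comparison.

    label is 1 if location i has a higher house price than location j,
    and 0 otherwise. Pairs whose absolute price gap is <= min_gap are
    skipped because they carry no clear ordering signal.
    """
    pairs = []
    for i, j in combinations(range(len(prices)), 2):
        gap = prices[i] - prices[j]
        if abs(gap) <= min_gap:
            continue
        pairs.append((i, j, int(gap > 0)))
    return pairs

# Hypothetical house prices for four locations (arbitrary units).
prices = [500.0, 750.0, 750.0, 300.0]
pairs = make_price_pairs(prices)
for i, j, label in pairs:
    print(f"location {i} vs location {j}: label={label}")
```

Locations 1 and 2 have equal prices in this toy input, so that pair is dropped; raising `min_gap` would discard more near-ties.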

RESULT
CROSS-DISTRICT VALIDATION
Findings
CONCLUSION AND FUTURE RESEARCH