Abstract
Datasets collecting demographic and socio-economic statistics are widely available. Still, the data are often only released for highly aggregated geospatial areas, which can mask important local hotspots. When conducting spatial analysis, one often needs to disaggregate the source data, transforming the statistics reported for a set of source zones into values for a set of target zones with a different geometry and a higher spatial resolution. This article reports on a novel dasymetric disaggregation method that uses encoder–decoder convolutional neural networks, similar to those adopted in image segmentation tasks, to combine different types of ancillary data. Model training is a particular challenge, because disaggregation tasks are ill-posed and offer no direct supervision signals in the form of training instances mapping low-resolution to high-resolution counts. We propose to address this problem through self-training. Our method iteratively refines initial estimates produced by disaggregation heuristics, training models on the estimates from previous iterations together with relevant regularization strategies. We conducted experiments on the disaggregation of different variables collected for Continental Portugal into a raster grid with a resolution of 200 m. Results show that the proposed approach outperforms common alternative methods, including approaches that use other types of regression models to infer the dasymetric weights.
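The two ideas named in the abstract, dasymetric weighting and self-training, can be illustrated with a minimal NumPy sketch. It is not the article's implementation: a linear least-squares model stands in for the encoder–decoder network, no regularization is applied, and rescaling each round of predictions so that source-zone totals are preserved is an assumption of this sketch. The function names `dasymetric_disaggregate` and `self_train`, and the array layout, are hypothetical.

```python
import numpy as np


def dasymetric_disaggregate(zone_ids, weights, zone_totals):
    """Spread each source-zone total over its cells in proportion to `weights`.

    zone_ids    : (H, W) integer array, source-zone index of every target cell
    weights     : (H, W) non-negative ancillary weights
    zone_totals : (Z,) counts reported for the source zones
    Zones whose weights sum to zero fall back to uniform (areal) weighting.
    """
    zone_ids = np.asarray(zone_ids)
    flat_ids = zone_ids.ravel()
    flat_w = np.asarray(weights, dtype=float).ravel()
    zone_totals = np.asarray(zone_totals, dtype=float)
    n_zones = zone_totals.shape[0]

    weight_sums = np.bincount(flat_ids, weights=flat_w, minlength=n_zones)
    cell_counts = np.bincount(flat_ids, minlength=n_zones)

    # Cells in zones with no usable weight get weight 1 (uniform split).
    in_empty_zone = weight_sums[flat_ids] == 0
    flat_w = np.where(in_empty_zone, 1.0, flat_w)
    weight_sums = np.where(weight_sums == 0, cell_counts, weight_sums)

    estimates = zone_totals[flat_ids] * flat_w / weight_sums[flat_ids]
    return estimates.reshape(zone_ids.shape)


def self_train(zone_ids, features, zone_totals, n_iter=3):
    """Iteratively refine a heuristic estimate (assumed, simplified procedure).

    features : (H, W, F) ancillary layers for the target grid
    A linear model is a stand-in for the CNN used in the article.
    """
    h, w, f = features.shape
    # Design matrix with an intercept column.
    X = np.column_stack([features.reshape(h * w, f), np.ones(h * w)])

    # Initial estimate from a heuristic: uniform weights inside each zone.
    estimates = dasymetric_disaggregate(zone_ids, np.ones((h, w)), zone_totals)

    for _ in range(n_iter):
        # Fit the model to the estimates from the previous iteration.
        coef, *_ = np.linalg.lstsq(X, estimates.ravel(), rcond=None)
        predicted = np.clip(X @ coef, 0.0, None).reshape(h, w)
        # Use the predictions as dasymetric weights, so that every
        # source-zone total is preserved after each round.
        estimates = dasymetric_disaggregate(zone_ids, predicted, zone_totals)
    return estimates
```

For example, with `zone_ids = [[0, 0], [1, 1]]`, `weights = [[1, 3], [0, 0]]`, and `zone_totals = [8, 10]`, `dasymetric_disaggregate` returns `[[2, 6], [5, 5]]`: the first zone is split 1:3 according to the weights, and the second zone, which has no usable weight, is split evenly.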
Highlights
Geospatial layers with statistical count data are widely available on a variety of subjects, such as economic activities or public health concerns
We first compare the application of the model, with a Root Mean Square Error (RMSE) loss and without any type of regularization strategy, against baseline disaggregation methods (Section 5.1)
The first results are reported for the variables related to the amount of withdrawals in three different scenarios, i.e., the overall amount for the entire year, the summer period, and the winter period
Summary
Geospatial layers with statistical count data are widely available on a variety of subjects, such as economic activities or public health concerns. These data are often collected or released at an aggregated level, with insufficient spatial detail for some applications (e.g., aggregated data for districts or municipalities, hiding variations between smaller administrative units). Fine-grained information can better enable the formulation of informed hypotheses in the context of demographic, social, or environmental issues. It can also improve the analysis of the data through different partitions of space or in terms of their relation to particular terrain characteristics (e.g., analysing relations with high-resolution data obtained through remote sensing or from volunteered geographic services).