Gridded Population Datasets Research Articles

Background: Conducting surveys in low- and middle-income countries is often challenging because many areas lack a complete sampling frame, have outdated census information, or have limited data available for designing and selecting a representative sample. Geosampling is a probability-based, gridded population sampling method that addresses some of these issues by using geographic information system (GIS) tools to create logistically manageable area units for sampling. GIS grid cells are overlaid to partition a country’s existing administrative boundaries into area units that vary in size from 50 m × 50 m to 150 m × 150 m. To avoid sending interviewers to unoccupied areas, researchers manually classify grid cells as “residential” or “nonresidential” through visual inspection of aerial images. “Nonresidential” units are then excluded from sampling and data collection. This manual classification of sampling units has drawbacks: it is labor intensive, prone to human error, and creates the need for simplifying assumptions when calculating design-based sampling weights. In this paper, we discuss the development of a deep learning classification model to predict whether aerial images are residential or nonresidential, thus reducing manual labor and eliminating the need for simplifying assumptions.

Results: On our test sets, the model performs comparably to a human-level baseline in both Nigeria (94.5% accuracy) and Guatemala (96.4% accuracy), and outperforms baseline machine learning models trained on crowdsourced or remote-sensed geospatial features. Additionally, our findings suggest that this approach can work well in new areas with relatively modest amounts of training data.

Conclusions: Gridded population sampling methods like geosampling are becoming increasingly popular in countries with outdated or inaccurate census data because of their timeliness, flexibility, and cost. Using deep learning models directly on satellite images, we provide a novel method for sample frame construction that identifies residential gridded aerial units. In cases where manual classification of satellite images is used to (1) correct for errors in gridded population data sets or (2) classify grids where population estimates are unavailable, this methodology can help reduce annotation burden with comparable quality to human analysts.
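
The grid-then-filter workflow described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `GridCell`, `partition_into_cells`, and `sample_residential` are hypothetical, the area is assumed to be a simple rectangle in projected metre coordinates rather than an administrative boundary, the residential label is a toy stand-in for either a human analyst or a trained classifier, and simple random sampling is used only because the abstract does not specify the selection design.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GridCell:
    """One square sampling unit (hypothetical structure for illustration)."""
    row: int
    col: int
    x_min: float   # lower-left corner, metres in a projected CRS (assumed)
    y_min: float
    size_m: float  # edge length, e.g. 50 to 150 m as in the abstract


def partition_into_cells(x_min, y_min, x_max, y_max, size_m):
    """Overlay a regular grid on a rectangular extent and return its cells."""
    n_cols = math.ceil((x_max - x_min) / size_m)
    n_rows = math.ceil((y_max - y_min) / size_m)
    return [
        GridCell(r, c, x_min + c * size_m, y_min + r * size_m, size_m)
        for r in range(n_rows)
        for c in range(n_cols)
    ]


def sample_residential(cells, is_residential, n, seed=0):
    """Drop cells labelled nonresidential, then draw a simple random sample."""
    eligible = [cell for cell in cells if is_residential(cell)]
    rng = random.Random(seed)
    return rng.sample(eligible, min(n, len(eligible)))


# Toy extent of 300 m x 200 m split into 100 m cells: 3 columns x 2 rows.
cells = partition_into_cells(0, 0, 300, 200, 100)

# Toy label: pretend the westernmost column is unoccupied land.
def toy_label(cell):
    return cell.col != 0

selected = sample_residential(cells, toy_label, n=2, seed=42)
```

In this framing, replacing manual inspection with a model means swapping `toy_label` for a function that runs the classifier on the aerial image tile covering each cell.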

Background: The spatial modeling of infectious disease distributions and dynamics is increasingly being undertaken for health services planning and disease control monitoring, implementation, and evaluation. Where risks are heterogeneous in space or dependent on person-to-person transmission, spatial data on human population distributions are required to estimate infectious disease risks, burdens, and dynamics. Several different modeled human population distribution datasets are available and widely used, but the disparities among them and the implications for enumerating disease burdens and populations at risk have not been considered systematically. Here, we quantify some of these effects using global estimates of populations at risk (PAR) of P. falciparum malaria as an example.

Methods: The recent construction of a global map of P. falciparum malaria endemicity enabled the testing of different gridded population datasets for providing estimates of PAR by endemicity class. The estimated population numbers within each class were calculated for each country using four different global gridded human population datasets: GRUMP (~1 km spatial resolution), LandScan (~1 km), UNEP Global Population Databases (~5 km), and GPW3 (~5 km). More detailed assessments of PAR variation and accuracy were conducted for three African countries where census data were available at a higher administrative-unit level than used by any of the four gridded population datasets.

Results: The estimates of PAR based on the datasets varied by more than 10 million people for some countries, even accounting for the fact that estimates of population totals made by different agencies are used to correct national totals in these datasets and can vary by more than 5% for many low-income countries. In many cases, these variations in PAR estimates comprised more than 10% of the total national population. The detailed country-level assessments suggested that none of the datasets was consistently more accurate than the others in estimating PAR. The sizes of such differences among modeled human populations were related to variations in the methods, input resolution, and date of the census data underlying each dataset. Data quality varied from country to country within the spatial population datasets.

Conclusions: Detailed, highly spatially resolved human population data are an essential resource for planning health service delivery for disease control, for the spatial modeling of epidemics, and for decision-making processes related to public health. However, our results highlight that for the low-income regions of the world where disease burden is greatest, existing datasets display substantial variations in estimated population distributions, resulting in uncertainty in disease assessments that utilize them. Increased efforts are required to gather contemporary and spatially detailed demographic data to reduce this uncertainty, particularly in Africa, and to develop population distribution modeling methods that match the rigor, sophistication, and ability to handle uncertainty of contemporary disease mapping and spread modeling. In the meantime, studies that utilize a particular spatial population dataset need to acknowledge the uncertainties inherent within them and consider how the methods and data that comprise each will affect conclusions.
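
The comparison described in the Methods of this abstract, summing gridded population within each endemicity class and then contrasting the totals across datasets, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's code: the function names `par_by_class` and `par_spread` are hypothetical, the grids are assumed to be already co-registered at one resolution (the real datasets range from about 1 km to about 5 km and would need resampling first), and every number and class label below is invented toy data.

```python
from collections import defaultdict


def par_by_class(population, endemicity_class):
    """Sum population counts per endemicity class over two aligned grids.

    Both arguments are lists of rows; None marks a cell with no data.
    """
    totals = defaultdict(float)
    for pop_row, cls_row in zip(population, endemicity_class):
        for pop, cls in zip(pop_row, cls_row):
            if pop is None or cls is None:
                continue  # skip cells missing in either grid
            totals[cls] += pop
    return dict(totals)


def par_spread(par_estimates):
    """Per class, the gap between the largest and smallest PAR estimate."""
    classes = set().union(*(est.keys() for est in par_estimates))
    return {
        cls: max(est.get(cls, 0.0) for est in par_estimates)
        - min(est.get(cls, 0.0) for est in par_estimates)
        for cls in classes
    }


# Invented 2 x 2 grids standing in for two population datasets.
dataset_a = [[100, 50], [0, 25]]
dataset_b = [[80, 70], [0, 25]]
classes = [["stable", "unstable"], ["none", "stable"]]

par_a = par_by_class(dataset_a, classes)
par_b = par_by_class(dataset_b, classes)
spread = par_spread([par_a, par_b])
```

Both toy datasets sum to the same total of 175, yet the per-class spread is nonzero, which mirrors the abstract's point that datasets can agree on national totals while disagreeing on where people are placed.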

Articles published on Gridded Population Datasets

The impact of seasonality on multi-scale feature extraction techniques

Linking Synthetic Populations to Household Geolocations: A Demonstration in Namibia

Dasymetric mapping of urban population in China based on radiance corrected DMSP-OLS nighttime light and land cover data

Residential scene classification for gridded population sampling in developing countries using deep convolutional neural networks on satellite imagery

Accuracy Assessment of Multi-Source Gridded Population Distribution Datasets in China

High-resolution reconstruction of the United States human population distribution, 1790 to 2010

GridSample: an R package to generate household survey primary sampling units (PSUs) from gridded population data

Research on Grid Size Suitability of Gridded Population Distribution in Urban Area: A Case Study in Urban Area of Xuanzhou District, China.

High-resolution African population projections from radiative forcing and socio-economic models, 2000 to 2100

Modelling changing population distributions: an example of the Kenyan Coast, 1979–2009

Improving Large Area Population Mapping Using Geotweet Densities.

Taking Advantage of the Improved Availability of Census Data: A First Look at the Gridded Population of the World, Version 4

Disaggregating census data for population mapping using random forests with remotely-sensed and ancillary data.

Exploring nationally and regionally defined models for large area population mapping

Generation of fine-scale population layers using multi-resolution satellite imagery and geospatial data

Large-scale spatial population databases in infectious disease research

The effects of spatial population dataset choice on estimates of population at risk of disease

Population detection profiles of DMSP-OLS night-time imagery by regions of the world

Assessing the use of global land cover data for guiding large area population distribution modelling

A high resolution spatial population database of Somalia for disease risk mapping
