Global gridded multi-temporal datasets to support human population distribution modelling
Population distributions across countries and regions exhibit significant spatial and temporal variability. This variation highlights the need for high-resolution, small-area demographic data to address the challenges posed by shifting population dynamics, urbanization, and migration. Small area population modelling, particularly the production of gridded population estimates, has advanced rapidly over the past decade. Gridded population estimates rely heavily on the availability of detailed geospatial ancillary datasets to capture, inform and explain the variabilities in population densities and distributions at small area scales, enabling the disaggregation from areal unit-based counts. Here we describe an extensive geospatial collection of annual, high resolution, spatio-temporally harmonised, global datasets aimed at driving improvements in mapping small area population density variation. This article presents the spatio-temporal harmonisation process that results in an open access repository of 73 individual gridded datasets addressing topography, climate, nighttime lights, land cover, inland water, infrastructure, protected areas as well as the built-up environment on a global level at a spatial resolution of 3 arc-seconds (approximately 100 metres). Datasets are available as annual time series from 2015 up to and including at least 2020, and as recent as 2023 where source datasets allow. Such datasets not only support population modelling but also applications across environmental, economic, and health sectors, supporting informed policy-making and resource allocation for sustainable development.
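As a rough check on the stated resolution, the ground size of a 3 arc-second cell can be estimated under a spherical-Earth assumption. The sketch below is illustrative only: the radius constant and the function name `cell_size_m` are assumptions introduced here and are not part of the described datasets.

```python
import math

# Mean Earth radius in metres; using a sphere is a simplifying assumption.
EARTH_RADIUS_M = 6_371_008.8


def cell_size_m(arc_seconds: float, latitude_deg: float = 0.0) -> tuple[float, float]:
    """Return (east-west width, north-south height) in metres of one grid cell.

    Hypothetical helper for illustration: the height depends only on the
    angular size, while the width shrinks with the cosine of the latitude.
    """
    angle_rad = math.radians(arc_seconds / 3600.0)
    height = EARTH_RADIUS_M * angle_rad
    width = height * math.cos(math.radians(latitude_deg))
    return width, height


if __name__ == "__main__":
    for lat in (0, 30, 60):
        width, height = cell_size_m(3, lat)
        print(f"lat {lat:>2} deg: {width:6.1f} m wide x {height:6.1f} m tall")
```

Under this approximation a 3 arc-second cell is roughly 93 m tall at any latitude and narrows east-west towards the poles, so "approximately 100 metres" is best read as a nominal figure.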
- Research Article
46
- 10.1007/s11111-020-00360-8
- Sep 1, 2020
- Population and Environment
Human activity is a major driver of change and has contributed to many of the challenges we face today. Detailed information about human population distribution is fundamental and use of freely available, high-resolution, gridded datasets on global population as a source of such information is increasing. However, there is little research to guide users in dataset choice. This study evaluates five of the most commonly used global gridded population datasets against a high-resolution Swedish population dataset on a pixel level. We show that datasets which employ more complex modeling techniques exhibit lower errors overall but no one dataset performs best under all situations. Furthermore, differences exist in how unpopulated areas are identified and changes in algorithms over time affect accuracy. Our results provide guidance in navigating the differences between the most commonly used gridded population datasets and will help researchers and policy makers identify the most suitable datasets under varying conditions.
- Research Article
50
- 10.1016/j.quaint.2014.12.027
- Jan 3, 2015
- Quaternary International
Comparison of monthly precipitation derived from high-resolution gridded datasets in arid Xinjiang, central Asia
- Research Article
113
- 10.1002/joc.5510
- Apr 1, 2018
- International Journal of Climatology
The accuracies of gridded precipitation data sets are important for regional climate studies and hydrological models. In this study, the performances of the Global Precipitation Climatology Centre (GPCC) V7, Climatic Research Unit (CRU) TS 3.22 and Willmott and Matsuura (WM) precipitation data sets were examined over central Asia by comparing them against observed precipitation records (OBS) from 586 meteorological stations during 1901–2010. The results show that all three gridded data sets underestimated the observed precipitation at annual and monthly scales, especially in mountainous areas. Both GPCC and WM underestimated seasonal precipitation, especially spring precipitation. Among the three gridded data sets, GPCC had the highest correlation with the OBS and the lowest bias. WM had a higher correlation than CRU but also a larger bias. In terms of drought and heavy rainfall events, CRU performed best in capturing drought events, and GPCC was best at representing heavy rainfall events. These differences in performance were primarily induced by the different interpolation methods and the numbers of available meteorological stations used in the interpolations of the three gridded data sets. Therefore, compared to the other two data sets, GPCC is more suitable for studies of long-term precipitation variations over central Asia.
- Research Article
183
- 10.1086/382255
- Apr 1, 2004
- Current Anthropology
Continental Physiography, Climate, and the Global Distribution of Human Population
- Research Article
56
- 10.3390/rs9080797
- Aug 2, 2017
- Remote Sensing
Nighttime light data can characterize urbanization, economic development, population density, energy consumption and other human activities. Additionally, carbon dioxide (CO2) emissions are closely related to the scope and intensity of human activities. In this study, we assess the utility of nighttime light data as a powerful tool to reflect CO2 emissions from energy consumption, analyze the uncertainty associated with different nighttime light data for modeling CO2 emissions, and provide guidance and a reference for modeling CO2 emissions based on nighttime light data. In this paper, Mainland China was taken as a case study, and nighttime light datasets (the Defense Meteorological Satellite Program’s Operational Linescan System (DMSP-OLS) nighttime light data and the Suomi National Polar-Orbiting Partnership Visible Infrared Imaging Radiometer Suite (NPP-VIIRS) nighttime light data) as well as a global gridded CO2 emissions dataset (PKU-CO2) were used to perform simple regressions at provincial, prefectural and 0.1° × 0.1° grid levels, respectively. The analyses are aimed at exploring the accuracy and uncertainty of DMSP-OLS and NPP-VIIRS nighttime light data in modeling CO2 emissions at different spatial scales. The improvement of nighttime light index and the potential factors influencing the effects of modeling CO2 emissions based on nighttime light datasets were also explored. The results show that DMSP-OLS is superior to NPP-VIIRS in modeling CO2 emissions at all spatial scales, and the bigger the scale, the more evident the advantages of DMSP-OLS. When modeling CO2 emissions with nighttime light datasets, not only the total amount of lights within a given statistical unit but also the agglomeration degree of lights should be taken into account. 
Furthermore, the geographical location and socio-economic conditions at the study site, such as gross regional product per capita (GRP per capita), population, and urbanization were shown to have an impact on the regression effect of the nighttime lights-CO2 emissions model. The regression effect was found to be better at higher latitude and longitude areas with higher GRP per capita and higher urbanization, while population showed little effect on the regression effect of the nighttime lights - CO2 emissions model. The limitation of this study is that the thresholds of potential factors are unclear and the quantitative guidance is insufficient.
- Research Article
6
- 10.1080/01431161.2020.1841322
- Dec 30, 2020
- International Journal of Remote Sensing
Gridded population datasets are essential for displaying spatial distributions of residential populations. They are widely used in urban planning, decision-making, disaster assessment, and public health. However, the grid resolution may affect the accuracy of population distributions, and this issue should be further explored to obtain a clearer understanding. Therefore, it is crucial to determine appropriate grid sizes for ascertaining the spatial characteristics of population distributions on a large scale. The choice of the grid resolution for a population dataset generally depends on the source datasets and the requirements of a specific project. While previous studies on grid resolutions were conducted predominantly in small study areas, this study focused primarily on the population distribution of the whole of China at 14 different scales, from 100 m to 1 km (with a 100-m interval), and from 1 km to 5 km (with a 1-km interval). Population spatialization was conducted using city-level census data from 351 cities in China and impervious surface data derived from satellite images. The dasymetric mapping method was employed to estimate the population distribution, and the scale effects of the population estimates were examined at different scales of impervious surface data. The results of an accuracy assessment of the population estimates using county-level census data demonstrated that the impervious surface data were useful and effective when estimating residential populations with dasymetric mapping. The accuracy of the estimated populations varied with the scale of the impervious surface data, and a scale of 2–4 km was deemed optimal for estimating the residential population distribution based on impervious surfaces while using the dasymetric mapping method.
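The dasymetric approach mentioned above can be illustrated with a minimal sketch: a census total for one areal unit is spread over its grid cells in proportion to an ancillary weight such as impervious surface cover. The function name, the example weights, and the uniform fallback for units with no weight at all are assumptions made for illustration, not the procedure of the cited study.

```python
def dasymetric_allocate(unit_total: float, weights: list[float]) -> list[float]:
    """Distribute one areal unit's population across its cells.

    Each cell receives a share proportional to its non-negative ancillary
    weight (for example, percent impervious surface). Hypothetical helper.
    """
    if not weights:
        raise ValueError("the areal unit must contain at least one cell")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total_weight = sum(weights)
    if total_weight == 0:
        # Assumption: with no ancillary signal, fall back to a uniform spread.
        return [unit_total / len(weights)] * len(weights)
    return [unit_total * w / total_weight for w in weights]


if __name__ == "__main__":
    # Percent impervious surface in four cells of one census unit (made-up values).
    impervious_pct = [0, 10, 30, 10]
    print(dasymetric_allocate(1000, impervious_pct))
```

The allocation preserves the unit total by construction, so the gridded values always sum back to the census count they were derived from.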
- Research Article
69
- 10.1186/1478-7954-9-4
- Feb 7, 2011
- Population Health Metrics
Background: The spatial modeling of infectious disease distributions and dynamics is increasingly being undertaken for health services planning and disease control monitoring, implementation, and evaluation. Where risks are heterogeneous in space or dependent on person-to-person transmission, spatial data on human population distributions are required to estimate infectious disease risks, burdens, and dynamics. Several different modeled human population distribution datasets are available and widely used, but the disparities among them and the implications for enumerating disease burdens and populations at risk have not been considered systematically. Here, we quantify some of these effects using global estimates of populations at risk (PAR) of P. falciparum malaria as an example.
Methods: The recent construction of a global map of P. falciparum malaria endemicity enabled the testing of different gridded population datasets for providing estimates of PAR by endemicity class. The estimated population numbers within each class were calculated for each country using four different global gridded human population datasets: GRUMP (~1 km spatial resolution), LandScan (~1 km), UNEP Global Population Databases (~5 km), and GPW3 (~5 km). More detailed assessments of PAR variation and accuracy were conducted for three African countries where census data were available at a higher administrative-unit level than used by any of the four gridded population datasets.
Results: The estimates of PAR based on the datasets varied by more than 10 million people for some countries, even accounting for the fact that estimates of population totals made by different agencies are used to correct national totals in these datasets and can vary by more than 5% for many low-income countries. In many cases, these variations in PAR estimates comprised more than 10% of the total national population. The detailed country-level assessments suggested that none of the datasets was consistently more accurate than the others in estimating PAR. The sizes of such differences among modeled human populations were related to variations in the methods, input resolution, and date of the census data underlying each dataset. Data quality varied from country to country within the spatial population datasets.
Conclusions: Detailed, highly spatially resolved human population data are an essential resource for planning health service delivery for disease control, for the spatial modeling of epidemics, and for decision-making processes related to public health. However, our results highlight that for the low-income regions of the world where disease burden is greatest, existing datasets display substantial variations in estimated population distributions, resulting in uncertainty in disease assessments that utilize them. Increased efforts are required to gather contemporary and spatially detailed demographic data to reduce this uncertainty, particularly in Africa, and to develop population distribution modeling methods that match the rigor, sophistication, and ability to handle uncertainty of contemporary disease mapping and spread modeling. In the meantime, studies that utilize a particular spatial population dataset need to acknowledge the uncertainties inherent within them and consider how the methods and data that comprise each will affect conclusions.
- Supplementary Content
92
- 10.1186/1476-072x-11-7
- Jan 1, 2012
- International Journal of Health Geographics
Modelling studies on the spatial distribution and spread of infectious diseases are becoming increasingly detailed and sophisticated, with global risk mapping and epidemic modelling studies now popular. Yet, in deriving populations at risk of disease estimates, these spatial models must rely on existing global and regional datasets on population distribution, which are often based on outdated and coarse resolution data. Moreover, a variety of different methods have been used to model population distribution at large spatial scales. In this review we describe the main global gridded population datasets that are freely available for health researchers and compare their construction methods, and highlight the uncertainties inherent in these population datasets. We review their application in past studies on disease risk and dynamics, and discuss how the choice of dataset can affect results. Moreover, we highlight how the lack of contemporary, detailed and reliable data on human population distribution in low income countries is proving a barrier to obtaining accurate large-scale estimates of population at risk and constructing reliable models of disease spread, and suggest research directions required to further reduce these barriers.
- Research Article
6
- 10.1016/j.jaridenv.2023.104963
- Feb 13, 2023
- Journal of Arid Environments
Evaluation of the accuracy of satellite-based rainfed wheat yield dataset over an area with complex geography
- Research Article
43
- 10.3390/ijgi10100681
- Oct 9, 2021
- ISPRS International Journal of Geo-Information
The release of global gridded population datasets, including the Gridded Population of the World (GPW), Global Human Settlement Population Grid (GHS-POP), WorldPop, and LandScan, has greatly facilitated cross-comparison for ongoing research related to anthropogenic impacts. However, little attention is paid to the consistency and discrepancy of these gridded products in regions with rapid changes in local population, e.g., Mainland Southeast Asia (MSEA), where the countries have experienced fast population growth since the 1950s. This situation is aggravated by scarce national demographics and incomplete census counts, which further limit their appropriate usage. Comparative analyses are therefore a priority for their better application. Here, the consistency and discrepancy of the four common global gridded population datasets were cross-compared by combining the 2015 provincial population statistics (census and yearbooks) via error-comparison based statistical methods. The results showed that: (1) LandScan performs the best in both spatial accuracy and estimated errors, followed by WorldPop, GHS-POP, and GPW in MSEA. (2) Provincial differences in estimated errors indicated that LandScan better reveals the spatial pattern of population density in Thailand and Vietnam, while WorldPop performs slightly better in Myanmar and Laos, and both fit well in Cambodia. (3) Substantial errors among the four gridded datasets normally occur in the provincial units with larger population density (over 610 persons/km2) and a rapid population growth rate (greater than 1.54%), respectively. The new findings in MSEA indicated that future usage of these datasets should pay attention to the estimated population in the areas characterized by high population density and rapid population growth.
- Research Article
20
- 10.1017/jog.2021.28
- Apr 30, 2021
- Journal of Glaciology
Gridded glacier datasets are essential for various glaciological and climatological research because they link glacier cover with the corresponding gridded meteorological variables. However, there are significant differences between the gridded data and the shapefile data in the total area calculations in the Randolph Glacier Inventory (RGI) 6.0 at global and regional scales. Here, we present a new global gridded glacier dataset based on the RGI 6.0 that eliminates the differences. The dataset is made by dividing the glacier polygons using cell boundaries and then recalculating the area of each polygon in the cell. Our dataset (1) exhibits a good agreement with the RGI area values for those regions in which gridded areas showed a generally good consistency with those in the shapefile data, and (2) reduces the errors existing in the current RGI gridded dataset. All data and code used in this study are freely available and we provide two examples to demonstrate the application of this new gridded dataset.
- Research Article
115
- 10.1111/cobi.12462
- Feb 17, 2015
- Conservation Biology
The nighttime light environment of much of the earth has been transformed by the introduction of electric lighting. This impact continues to spread with growth in the human population and extent of urbanization. This has profound consequences for organismal physiology and behavior and affects abundances and distributions of species, community structure, and likely ecosystem functions and processes. Protected areas play key roles in buffering biodiversity from a wide range of anthropogenic pressures. We used a calibration of a global satellite data set of nighttime lights to determine how well they are fulfilling this role with regard to artificial nighttime lighting. Globally, areas that are protected tend to be darker at night than those that are not, and, with the exception of Europe, recent regional declines in the proportion of the area that is protected and remains dark have been small. However, much of these effects result from the major contribution to overall protected area coverage by the small proportion of individual protected areas that are very large. Thus, in Europe and North America high proportions of individual protected areas (>17%) have exhibited high levels of nighttime lighting in all recent years, and in several regions (Europe, Asia, South and Central America) high proportions of protected areas (32-42%) have had recent significant increases in nighttime lighting. Limiting and reversing the erosion of nighttime darkness in protected areas will require routine consideration of nighttime conditions when designating and establishing new protected areas; establishment of appropriate buffer zones around protected areas where lighting is prohibited; and landscape level reductions in artificial nighttime lighting, which is being called for in general to reduce energy use and economic costs.
- Research Article
1
- 10.1080/01431161.2023.2227319
- Jun 18, 2023
- International Journal of Remote Sensing
Artificial night-time light (NTL) emissions, collected by satellites, are a reliable and widely used remote proxy for urban extent. Recently, it was demonstrated that NTLs are more strongly associated with the total volume of the buildings than with their total footprint area, a traditional urban extent measure. However, this finding may not be a general rule: We presume that both associations are sensitive to characteristics of the built-up environment, while the latter is known to be quite heterogeneous within and between cities. Moreover, we hypothesize that at least in specific built-up environments, NTL may correlate even more strongly with another urban extent measure – the total lateral surface area of the buildings, due to its more accurate proportionality to the area of light-emitting windows. The present study uses NTLs and buildings datasets of 38 European capital cities and compares the associations of NTLs with the three aforementioned urban extent measures (the total footprint area, the total volume, and the total lateral surface area of the buildings) across different built-up environments. The results indicate that in urban areas characterized by a high density of tall buildings the strongest observed association is between NTLs and the buildings' footprint area. In comparison, the total volume of buildings is better related to NTLs in urban areas that host compact settings of relatively small buildings. But there is a third type of physical urban configuration, characterized by large buildings sparsely distributed in space. In this type of urban area, the strongest observed association is between NTLs and lateral surface area. We conclude that monitoring urban growth using NTLs will benefit from a preliminary assessment of the local built-up characteristics.
- Research Article
27
- 10.5194/cp-7-527-2011
- May 20, 2011
- Climate of the Past
The Central Netherlands Temperature (CNT) is a monthly daily mean temperature series constructed from homogenized time series from the centre of the Netherlands. The purpose of this series is to offer a homogeneous time series representative of a larger area in order to study large-scale temperature changes. It will also facilitate a comparison with climate models, which resolve similar scales. From 1906 onwards, temperature measurements in the Netherlands have been sufficiently standardized to construct a high-quality series. Long time series have been constructed by merging nearby stations and using the overlap to calibrate the differences. These long time series and a few time series of only a few decades in length have been subjected to a homogeneity analysis in which significant breaks and artificial trends have been corrected. Many of the detected breaks correspond to changes in the observations that are documented in the station metadata. This version of the CNT, to which we attach the version number 1.1, is constructed as the unweighted average of four stations (De Bilt, Winterswijk/Hupsel, Oudenbosch/Gilze-Rijen and Gemert/Volkel) with the stations Eindhoven and Deelen added from 1951 and 1958 onwards, respectively. The global gridded datasets used for detecting and attributing climate change are based on raw observational data. Although some homogeneity adjustments are made, these are not based on knowledge of local circumstances but only on statistical evidence. Despite this handicap, and the fact that these datasets use grid boxes that are far larger than the area associated with the Central Netherlands Temperature, the temperature interpolated to the CNT region shows a warming trend that is broadly consistent with the CNT trend in all of these datasets. The actual trends differ from the CNT trend by up to 30%, which highlights the need to base future global gridded temperature datasets on homogenized time series.
- Research Article
12
- 10.1007/s10109-019-00305-2
- Aug 2, 2019
- Journal of Geographical Systems
Characteristics of the urban environment influence where and when crime events occur; however, past studies often analyse cross-sectional data for one spatial scale and do not account for the processes and place-based policies that influence crime across multiple scales. This research applies a Bayesian cross-classified multilevel modelling approach to examine the spatiotemporal patterning of violent crime at the small-area, neighbourhood, electoral ward, and police patrol zone scales. Violent crime is measured at the small-area scale (lower-level units) and small areas are nested in neighbourhoods, electoral wards, and patrol zones (higher-level units). The cross-classified multilevel model accommodates multiple higher-level units that are non-hierarchical and have overlapping geographical boundaries. Results show that violent crime is positively associated with population size, residential instability, the central business district, and commercial, government-institutional, and recreational land uses within small areas and negatively associated with civic engagement within electoral wards. Combined, the three higher-level units explain approximately fifteen per cent of the total spatiotemporal variation of violent crime. Neighbourhoods are the most important source of variation among the higher-level units. This study advances understanding of the multiscale processes influencing spatiotemporal crime patterns and provides area-specific information within the geographical frameworks used by policymakers in urban planning, local government, and law enforcement.