Abstract

Urbanization poses significant challenges to sustainable development, disaster resilience, climate change mitigation, and environmental and resource management. Accurate urban extent datasets at large spatial scales are essential for researchers and policymakers to better understand urbanization dynamics and their socioeconomic drivers and impacts. While high-resolution urban extent data products - including the Global Human Settlements Layer (GHSL), the Global Man-Made Impervious Surface (GMIS), the Global Human Built-Up and Settlement Extent (HBASE), and the Global Urban Footprint (GUF) - have recently become available, intermediate-resolution urban extent data products, including the 1 km SEDAC’s Global Rural-Urban Mapping Project (GRUMP), MODIS 1 km, and MODIS 500 m, still have many users, and a recent study has shown that products at around 500 m resolution are more appropriate for urbanization process analysis than those at higher resolution (30 m). The objective of this study is to improve large-scale urban extent mapping at an intermediate resolution (500 m) using machine learning methods by combining complementary nighttime Visible Infrared Imaging Radiometer Suite (VIIRS) and daytime Moderate Resolution Imaging Spectroradiometer (MODIS) data, taking the conterminous United States (CONUS) as the study area. The effectiveness of commonly used machine learning methods, including random forest (RF), gradient boosting machine (GBM), neural network (NN), and their ensemble (ESB), has been explored.
Our results show that these machine learning methods achieve similarly high accuracies across all accuracy metrics (>95% overall accuracy, >98% producer’s accuracy, and >92% user’s accuracy) with Kappa coefficients greater than 0.90, a level not achieved in the existing data products or by previous studies; that the ESB does not produce significantly better accuracies than the individual machine learning methods; and that GBM generates 14%, 16%, and 11% more total misclassifications than RF, NN, and ESB, respectively, with NN generating the fewest. This indicates that these machine learning methods, especially NN and RF, combined with VIIRS nighttime light and MODIS daytime normalized difference vegetation index (NDVI) data, can produce high-accuracy intermediate-resolution urban extent data products at large spatial scales. The methodology has the potential to be applied to annual continental-to-global scale urban extent mapping at intermediate resolutions.
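The four accuracy metrics reported above can all be derived from a binary (urban / non-urban) confusion matrix. The following is a minimal sketch of the standard definitions, not the study's own code; the function name and the pixel counts in the usage lines are hypothetical and chosen only for illustration.

```python
def accuracy_metrics(tp, fn, fp, tn):
    """Standard accuracy metrics for a binary urban / non-urban map.

    tp: reference urban, mapped urban
    fn: reference urban, mapped non-urban (omission error)
    fp: reference non-urban, mapped urban (commission error)
    tn: reference non-urban, mapped non-urban
    """
    n = tp + fn + fp + tn
    overall = (tp + tn) / n          # share of all samples mapped correctly
    producers = tp / (tp + fn)       # producer's accuracy of the urban class
    users = tp / (tp + fp)           # user's accuracy of the urban class
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)
    kappa = (overall - expected) / (1 - expected)
    return {
        "overall": overall,
        "producers": producers,
        "users": users,
        "kappa": kappa,
    }


# Hypothetical counts for 1000 reference samples (not taken from the study).
metrics = accuracy_metrics(tp=490, fn=10, fp=30, tn=470)
print(metrics)
```

With these illustrative counts, overall accuracy is 0.96, producer's accuracy 0.98, user's accuracy about 0.94, and Kappa 0.92. The gap between producer's and user's accuracy shows why a single overall figure can hide an imbalance between omission and commission errors.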

Highlights

  • Over 50% of the global population already lives in urban areas, and two-thirds is expected to live in urban areas by 2050 [1]

  • The objective of this study is to explore the effectiveness of machine learning methods for improving the accuracy of large-scale urban extent mapping at an intermediate resolution (500 m), based on the combination of complementary Visible Infrared Imaging Radiometer Suite (VIIRS) nighttime light and Moderate Resolution Imaging Spectroradiometer (MODIS) daytime normalized difference vegetation index (NDVI) data

  • To explore and compare the effectiveness of random forest (RF), gradient boosting machine (GBM), neural network (NN), and their ensemble (ESB) in mapping urban extent, the same datasets were used as inputs for all methods: the VIIRS nighttime light luminosity annual composite, the MODIS NDVI annual composite, and the training reference samples
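The text here does not state how the ensemble (ESB) combines the individual classifiers. As one plausible illustration only, the sketch below assumes a per-pixel majority vote over the binary labels produced by the three models; the function name and the toy label lists are hypothetical.

```python
def majority_vote(*model_predictions):
    """Combine per-pixel binary labels (1 = urban, 0 = non-urban).

    Assumption: a simple majority vote. The source does not specify
    the ensemble rule actually used for ESB.
    """
    combined = []
    for votes in zip(*model_predictions):
        # A pixel is labeled urban when more than half of the models say so.
        combined.append(1 if sum(votes) * 2 > len(votes) else 0)
    return combined


# Toy predictions for five pixels from three hypothetical models.
rf_labels = [1, 0, 1, 1, 0]
gbm_labels = [1, 1, 0, 1, 0]
nn_labels = [0, 0, 1, 1, 0]

esb_labels = majority_vote(rf_labels, gbm_labels, nn_labels)
print(esb_labels)
```

Because every model receives the same inputs, a vote of this kind can only change the label of pixels where the models disagree, which is consistent with an ensemble offering little gain when the individual models already agree closely.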


Summary

Introduction

Over 50% of the global population already lives in urban areas, and two-thirds is expected to live in urban areas by 2050 [1]. The reasons for mapping urban extent at an intermediate resolution include the following: (1) intermediate-resolution satellite images have proven effective in urban extent extraction at regional to global scales [10,16,19,20] and are more computationally efficient to process; (2) intermediate-resolution urban extent data products generated from satellite data, such as the 1 km NASA Socioeconomic Data and Applications Center (SEDAC)’s Global Rural-Urban Mapping Project (GRUMP), MODIS 1 km, and MODIS 500 m [10,21,22], still attract many analysis and modeling users [23,24,25,26,27]; (3) because urban and rural areas are not necessarily discrete classes but more of a continuum [28,29], intermediate-resolution data products may better reflect the demographic and sociological conditions of urban areas [12,15], which include not just built-up areas but the urban fabric of core urban areas and the surrounding hinterlands and commuter-sheds; (4) broader definitions of what constitutes an urban area are useful for studies of urban morphology, energy use, climate change, and sustainability [14], and for research on rural agricultural systems where one may wish to exclude all but the smallest built-up areas; and (5) a recently published study has demonstrated that urban extent data products at 480 m resolution are more appropriate than those at high resolution (30 m) for urbanization process analysis at large spatial scales [30].
