Abstract

Spatial population distribution data discretize demographic data into spatial grids and serve as an important reference for disaster emergency response, disaster assessment, allocation of emergency rescue resources, and post-disaster reconstruction. The random forest (RF) model is a prominent method for modeling the spatial distribution of population and has been studied widely, both domestically and abroad. Existing research has focused on multi-source data fusion, feature selection, and data accuracy evaluation within the modeling process. However, parameter optimization methods and the impact of different optimization methods on modeling accuracy have received relatively little discussion. This paper therefore uses the RF model to study population spatialization with multi-source spatial information data. The study compares the model parameter optimization achieved by six approaches: random search, grid search, genetic algorithms, simulated annealing, Bayesian optimization based on Gaussian processes, and Bayesian optimization based on gradient boosting regression trees. It also investigates how the choice of optimization algorithm affects the accuracy of population spatialization modeling. The model with the highest accuracy is then selected as the prediction model and used to generate a spatial population distribution dataset of Sichuan Province at a 1 km resolution. Finally, the resulting population dataset is compared and validated against open datasets such as GPW, LandScan, and WorldPop.
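
As a rough illustration of the two simplest optimization strategies named above, the following sketch contrasts grid search with random search over a small hyperparameter space. It is not the paper's implementation: the search space, the parameter names `n_estimators` and `max_depth`, and the function `toy_validation_error` (a stand-in for a cross-validated RF error) are all assumptions made for illustration, and only the Python standard library is used.

```python
import itertools
import random

# Hypothetical search space for two random forest hyperparameters (assumed values).
SPACE = {
    "n_estimators": [100, 200, 300, 400, 500],
    "max_depth": [5, 10, 15, 20, 25],
}


def toy_validation_error(params):
    """Stand-in for a cross-validated RF error (assumption); minimum at (300, 15)."""
    return ((params["n_estimators"] - 300) / 100) ** 2 + (
        (params["max_depth"] - 15) / 5
    ) ** 2


def grid_search(space, objective):
    """Evaluate every combination in the grid and return the best one."""
    names = list(space)
    candidates = [
        dict(zip(names, values))
        for values in itertools.product(*(space[name] for name in names))
    ]
    return min(candidates, key=objective)


def random_search(space, objective, n_iter, seed=0):
    """Evaluate n_iter randomly sampled combinations and return the best one."""
    rng = random.Random(seed)
    candidates = [
        {name: rng.choice(values) for name, values in space.items()}
        for _ in range(n_iter)
    ]
    return min(candidates, key=objective)


grid_best = grid_search(SPACE, toy_validation_error)
random_best = random_search(SPACE, toy_validation_error, n_iter=10)
```

Grid search evaluates all 25 combinations and is guaranteed to find the best point on the grid, whereas random search evaluates only the 10 sampled combinations and may or may not reach it. That trade-off between cost and coverage is one reason the choice of optimization method can affect the accuracy of the final model.
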
Experimental results indicate that the spatial population distribution dataset produced by the Bayesian optimization-based random forest model proposed in this paper fits the real data more closely. The coefficient of determination (R²) is 0.6628, the mean absolute error (MAE) is 12,459, and the root mean squared error (RMSE) is 25,037. Compared with publicly available international datasets, the dataset generated in this paper represents the spatial distribution of the population more accurately.
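
The three accuracy measures reported above can be computed as in the following minimal sketch. It assumes the standard definitions, with R² taken as one minus the ratio of the residual sum of squares to the total sum of squares; the abstract does not state the exact formulas used. The function name `regression_metrics` and the sample values are hypothetical and are not data from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R2, MAE, RMSE) for paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)  # residual sum of squares
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    r2 = 1 - ss_res / ss_tot
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(ss_res / n)
    return r2, mae, rmse


# Made-up population counts for four hypothetical administrative units.
observed = [100, 200, 300, 400]
predicted = [110, 190, 320, 380]
r2, mae, rmse = regression_metrics(observed, predicted)
```

Because RMSE squares each residual before averaging, it weights large errors more heavily than MAE does, so a large gap between the two suggests that a few units have large errors.
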
