Small data samples remain a critical challenge for spatial prediction. Land use regression (LUR) is a widely used model for spatial prediction from observations at a limited number of locations. Studies have demonstrated that LUR models can overcome a limitation of other spatial prediction models, which usually require a greater spatial density of observations. However, the prediction accuracy and robustness of LUR models still need to be improved because of the linear regression within the LUR model. To improve LUR models, this study develops a land use quantile regression (LUQR) model for more accurate spatial prediction from small data samples. LUQR integrates LUR with quantile regression, both of which have advantages for prediction from small samples. In this study, the LUQR model is applied to predict the spatial distribution of annual mean PM2.5 concentrations across the Greater Sydney Region, New South Wales, Australia, using observations from 19 valid monitoring stations in 2020. Cross-validation shows that, compared with LUR, LUQR models can improve goodness-of-fit by 25.6–32.1% and can reduce prediction root mean squared error (RMSE) and mean absolute error (MAE) by 10.6–13.4% and 19.4–24.7%, respectively. This study also indicates that LUQR is more robust than LUR for spatial prediction with small data samples. Thus, LUQR has great potential to be widely applied to spatial problems with a limited number of observations.
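As a rough illustration of why a quantile-based objective can be more robust than least squares on a handful of observations, the sketch below compares the two on an intercept-only model. It is not the study's LUQR implementation: the five values are hypothetical rather than monitoring data, and the grid search and the names `pinball_loss`, `rmse`, `mae`, `y` and `candidates` are assumptions made for this example.

```python
import numpy as np


def pinball_loss(y, y_hat, tau=0.5):
    """Mean check (pinball) loss, the objective minimised by quantile regression."""
    r = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    return float(np.mean(np.maximum(tau * r, (tau - 1.0) * r)))


def rmse(y, y_hat):
    """Root mean squared error, the objective minimised by least squares."""
    r = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean(r ** 2)))


def mae(y, y_hat):
    """Mean absolute error."""
    r = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    return float(np.mean(np.abs(r)))


# Hypothetical small sample with one unusually high value (not the study's data).
y = np.array([5.0, 6.0, 6.5, 7.0, 20.0])

# Intercept-only model: search a grid of constant predictions for each objective.
candidates = np.linspace(0.0, 25.0, 2501)
best_quantile = candidates[np.argmin([pinball_loss(y, c, tau=0.5) for c in candidates])]
best_least_squares = candidates[np.argmin([rmse(y, c) for c in candidates])]

print("median-quantile fit:", round(float(best_quantile), 2))
print("least-squares fit:  ", round(float(best_least_squares), 2))
print("MAE of each fit:    ", mae(y, best_quantile), mae(y, best_least_squares))
```

With `tau=0.5` the quantile objective settles on the sample median, while the least-squares objective is pulled toward the single high value. A full LUQR model would replace the constant with land use predictors, which this sketch does not attempt.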