Approximately half of satellite aerosol retrievals are missing, which limits the application of satellite data in PM2.5 pollution monitoring. To obtain spatiotemporally continuous PM2.5 distributions, various gap-filling methods have been developed, but they have rarely been evaluated. Here, we reviewed and summarized four types of gap-filling strategies and applied them to a random forest PM2.5 prediction model that incorporated ground observations, chemical transport model (CTM) simulations, and satellite aerosol optical depth (AOD) to predict daily PM2.5 concentrations at a 1-km resolution in 2013 in the Beijing-Tianjin-Hebei region and the Yangtze River Delta. The model out-of-bag predictions were compared with national station measurements and external measurements to assess the performance of the different gap-filling methods. We also conducted a by-city cross-validation and characterized the spatial distributions of the PM2.5 predictions when AOD coverage was low. We found that the methods that fill in missing data by regression, i.e. multiple imputation and decision tree, performed robustly in characterizing PM2.5 variation at a high spatial resolution, and that the method that fills in missing PM2.5 predictions with a decision tree overcame the problem of time-consuming computations. The methods that use spatiotemporal trends to fill in missing data, i.e. ordinary kriging and the generalized additive mixed model, may be overrated by statistical evaluation tests and predicted artificially oversmoothed PM2.5 spatial distributions. We also found that CTM simulations improved the prediction of PM2.5 spatial distributions in all models, regardless of gap-filling strategy, yielding higher prediction accuracy in the by-city cross-validation. Finally, the PM2.5 prediction was not sensitive to the resolution of the CTM simulations; even 12-km resolution CTM simulations benefited the high-resolution PM2.5 prediction.
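
As an illustration of the by-city cross-validation mentioned above, the following minimal Python sketch holds out all records from one city at a time and scores predictions on the held-out city. It rests on several assumptions that the abstract does not specify: the fold design (one city per fold), the error metric (RMSE), and the function names `by_city_folds`, `rmse`, and `evaluate_by_city`, all of which are hypothetical. The mean predictor is only a placeholder for the random forest model, and the station values are made-up toy numbers, not data from this study.

```python
from collections import defaultdict
from math import sqrt


def by_city_folds(cities):
    """Yield (held_out_city, train_idx, test_idx), one fold per city.

    Assumption: each fold holds out every record of exactly one city.
    """
    groups = defaultdict(list)
    for i, city in enumerate(cities):
        groups[city].append(i)
    for city in sorted(groups):
        test_idx = groups[city]
        held_out = set(test_idx)
        train_idx = [i for i in range(len(cities)) if i not in held_out]
        yield city, train_idx, test_idx


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def evaluate_by_city(cities, pm25):
    """Return {city: RMSE} using a placeholder mean predictor.

    The mean of the training cities stands in for a real model
    (e.g. a random forest); it is used here only to keep the sketch
    dependency-free.
    """
    scores = {}
    for city, train_idx, test_idx in by_city_folds(cities):
        train_mean = sum(pm25[i] for i in train_idx) / len(train_idx)
        observed = [pm25[i] for i in test_idx]
        predicted = [train_mean] * len(test_idx)
        scores[city] = rmse(observed, predicted)
    return scores


# Made-up toy records (hypothetical values, for illustration only).
toy_cities = ["Beijing", "Beijing", "Tianjin", "Tianjin", "Shanghai", "Shanghai"]
toy_pm25 = [90.0, 110.0, 80.0, 100.0, 50.0, 70.0]

toy_scores = evaluate_by_city(toy_cities, toy_pm25)
```

Because no record from the held-out city is seen during training, a grouped split of this kind tests how well a model extrapolates in space, which is harder to pass with an oversmoothed spatial field than a random split over stations.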