Nitrogen dioxide (NO2) is an important pollutant related to human activities and has both short-term and long-term effects on human health. An ensemble learning model was constructed and applied to estimate daily NO2 concentrations in the Beijing–Tianjin–Hebei region between 2010 and 2016. The predictor variables included satellite-based tropospheric NO2 vertical column concentration, meteorology, elevation, gross domestic product (GDP), population, land-use variables, and road network. The model produced estimates on a 0.01° × 0.01° grid and reconstructed historical concentrations for 2010–2013. It showed good performance: the R2 of tenfold cross-validation was 0.72 and the R2 of test validation was 0.71. Lagged meteorological effects were incorporated into the model, with the one-day lagged boundary layer height contributing the most. The annual NO2 estimates showed little change from 2010 to 2016. Seasonally, estimated NO2 was highest in winter, followed by autumn, spring, and summer. In both the annual and seasonal maps, NO2 estimates in the northwest of the region were lower than those in the southeast, and there was a heavily polluted band in the south of the Taihang Mountains. In coastal areas, the annual NO2 estimates were higher than the monitored NO2 values. A drawback of the model is that it underestimates high values and overestimates low values. This study indicates that the ensemble learning model performs well in simulating NO2 at high spatial and temporal resolution. Furthermore, the research framework of this study can be applied more generally to other regions, especially other cities in China.
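The abstract mentions two mechanics that a small example may make concrete: a one-day lagged meteorological predictor and an R2 score. The following is a minimal sketch, not the authors' implementation. The record layout (`site`, `day`, `blh`), the helper names, and the sample values are all hypothetical, and R2 is computed here as the coefficient of determination, which may differ from the definition used in the study.

```python
from datetime import date, timedelta


def add_lagged_feature(rows, name, lag_days=1):
    """Return copies of rows with `<name>_lag<k>` taken from the same site k days earlier."""
    by_key = {(r["site"], r["day"]): r[name] for r in rows}
    out = []
    for r in rows:
        prev_day = r["day"] - timedelta(days=lag_days)
        lagged = by_key.get((r["site"], prev_day))  # None when the earlier day is missing
        out.append({**r, f"{name}_lag{lag_days}": lagged})
    return out


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up boundary layer heights for illustration only; not data from the study.
rows = [
    {"site": "A", "day": date(2014, 1, 1), "blh": 300.0},
    {"site": "A", "day": date(2014, 1, 2), "blh": 450.0},
    {"site": "A", "day": date(2014, 1, 3), "blh": 520.0},
]
lagged_rows = add_lagged_feature(rows, "blh", lag_days=1)
```

In this sketch the first day of each site has no lagged value, so a real pipeline would need to drop or impute those rows before model fitting.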