Abstract

We developed a two-stage model called the random-forest–generalised additive model (RF–GAM), based on satellite data, meteorological factors, and other geographical covariates, to predict the surface 8 h O3 concentrations across the remote Tibetan Plateau. The 10-fold cross-validation result suggested that RF–GAM showed excellent performance, with the highest R2 value (0.76) and lowest root-mean-square error (RMSE) (14.41 µg m−3), compared with seven other machine-learning models. The predictive performance of RF–GAM showed significant seasonal discrepancy, with the highest R2 value observed in summer (0.74), followed by winter (0.69) and autumn (0.67), and the lowest in spring (0.64). Additionally, ground-observed O3 data that were not used for training, collected from open-access websites, were applied to test the transferability of the novel model and confirmed that the model was robust in predicting the surface 8 h O3 concentration during other periods (R2 = 0.67, RMSE = 25.68 µg m−3). RF–GAM was then used to predict the daily 8 h O3 level over the Tibetan Plateau during 2005–2018 for the first time. The estimated O3 concentration displayed a slow increase, from 64.74±8.30 µg m−3 to 66.45±8.67 µg m−3 from 2005 to 2015, whereas it decreased from the peak to 65.87±8.52 µg m−3 during 2015–2018. In addition, the estimated 8 h O3 concentrations exhibited notable spatial variation, with the highest values in some cities of the northern Tibetan Plateau, such as Huangnan (73.48±4.53 µg m−3) and Hainan (72.24±5.34 µg m−3), followed by the cities in the central region, including Lhasa (65.99±7.24 µg m−3) and Shigatse (65.15±6.14 µg m−3), and the lowest O3 concentration in Aba, a city of the southeastern Tibetan Plateau (55.17±12.77 µg m−3). Based on the 8 h O3 critical value (100 µg m−3) provided by the World Health Organization (WHO), we further estimated the annual mean nonattainment days over the Tibetan Plateau. Most of the cities on the Tibetan Plateau had excellent air quality, while several cities (e.g. Huangnan, Haidong, and Guoluo) still suffered from more than 40 nonattainment days each year and should be given more attention in order to alleviate local O3 pollution. The results shown herein confirm that the novel hybrid model improves the prediction accuracy and can be applied to assess the potential health risk, particularly in remote regions with few monitoring sites.
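As an illustration of the nonattainment-day estimate mentioned in the abstract, the following is a minimal sketch that counts the days on which a daily 8 h O3 value exceeds the WHO critical value of 100 µg m−3 and averages the yearly counts. The function names, the strict "greater than" comparison, and the sample values are assumptions made for this example; they are not taken from the study.

```python
from statistics import mean

# WHO 8 h O3 critical value quoted in the abstract (µg m-3).
WHO_8H_O3_LIMIT = 100.0


def nonattainment_days(daily_o3, limit=WHO_8H_O3_LIMIT):
    """Count days whose 8 h O3 value is strictly above the limit.

    Treating a value equal to the limit as attainment is an assumption.
    """
    return sum(1 for value in daily_o3 if value > limit)


def annual_mean_nonattainment(daily_by_year, limit=WHO_8H_O3_LIMIT):
    """Average the yearly exceedance counts over all years supplied."""
    return mean(nonattainment_days(values, limit)
                for values in daily_by_year.values())


# Hypothetical daily values for illustration only, not data from the study.
example = {
    2005: [80.0, 101.5, 99.9, 120.0],
    2006: [100.0, 100.1, 95.0, 60.0],
}

print(nonattainment_days(example[2005]))
print(annual_mean_nonattainment(example))
```

In practice the input would be the model's predicted daily 8 h O3 series for each city or grid cell, with one list per year.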

Highlights

  • Along with rapid economic development and urbanisation, anthropogenic emissions of nitrogen oxides (NOx) and volatile organic compounds (VOCs) displayed high-speed growth

  • The spatial distribution pattern modelled by the random-forest–generalised additive model (RF–GAM) showed similar characteristics to the results of previous studies, except on the northern Tibetan Plateau (Liu et al., 2018)

  • We developed a novel hybrid model (RF–GAM) based on multiple explanatory variables to estimate the surface 8 h O3 concentration across the remote Tibetan Plateau


Summary

Introduction

Along with rapid economic development and urbanisation, anthropogenic emissions of nitrogen oxides (NOx) and volatile organic compounds (VOCs) displayed high-speed growth. Later on, Wang et al. (2016) developed a hybrid model called land use regression (LUR) coupled with CTMs to predict the surface O3 concentration in the Los Angeles Basin, California. In recent years, these methods were applied to estimate the surface O3 level over China. The few monitoring sites on the Tibetan Plateau cannot capture the real O3 pollution status, especially in remote areas (e.g. the northern part of the Tibetan Plateau), because each site possesses only limited spatial representativeness. Apart from these field measurements, Liu et al. (2018) (R = 0.60) and Zhan et al. (2018) (R2 = 0.66) used CTMs and a machine-learning model, respectively, to simulate the surface O3 concentration over China in 2015. Filling the gap of statistical estimation of the 8 h O3 level in a remote region, this study provides useful datasets for epidemiological studies and air quality management.
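The performance figures quoted above mix R and R2, which are not on the same scale. As a minimal sketch, the code below computes the Pearson correlation coefficient, the coefficient of determination, and the RMSE for paired observed and predicted values. It assumes that R denotes the Pearson correlation and R2 the coefficient of determination, which the text does not state; the function names and sample values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between paired observations and predictions."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between the two series."""
    mean_obs = sum(observed) / len(observed)
    mean_pred = sum(predicted) / len(predicted)
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cov / math.sqrt(var_obs * var_pred)


# Hypothetical 8 h O3 values (µg m-3) for illustration only.
obs = [60.0, 70.0, 80.0, 90.0]
pred = [62.0, 68.0, 83.0, 87.0]

print(pearson_r(obs, pred), r_squared(obs, pred), rmse(obs, pred))
```

Under these assumptions, a correlation of R = 0.60 would correspond to a squared correlation of 0.36, so it should not be read against R2 = 0.66 as if the two were the same quantity.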

Study area
Ground-level 8 h O3 concentration
Satellite-retrieved O3 column amount
Meteorological data and geographical covariates
Model development and assessment
The validation of model performance
Variable importance
The nonattainment days over the Tibetan Plateau during 2005–2018
Summary and implications