In this study, six individual machine learning (ML) models and a stacked ensemble model (SEM) were used to estimate daytime visibility at Bangkok airport during the dry season (November-April) from 2017 to 2022. The individual ML models are random forest, adaptive boosting, gradient boosting, extreme gradient boosting, light gradient boosting machine, and categorical boosting. The SEM was developed by combining the outputs of the individual models. Furthermore, the impact of factors affecting visibility was examined using the SHapley Additive exPlanations (SHAP) method, an interpretable ML technique based on game theory. The predictor variables include air pollutants, meteorological variables, and time-related variables. The light gradient boosting machine was identified as the most effective individual ML model. On an hourly time scale, it showed the best performance on three of the four metrics, with ρ = 0.86, MB = 0 km, ME = 0.48 km (second lowest), and RMSE = 0.8 km. On a daily time scale, it performed best on all evaluation metrics, with ρ = 0.92, MB = 0.0 km, ME = 0.3 km, and RMSE = 0.43 km. The SEM outperformed all the individual models on three of the four metrics on an hourly time scale, with ρ = 0.88, MB = 0.0 km, the second-lowest ME, and RMSE = 0.75 km. On the daily scale, it performed best, with ρ = 0.93, MB = 0.02 km, ME = 0.27 km, and RMSE = 0.4 km. The seasonal average original (VISorig) and meteorologically normalized (VISnorm) visibility decreased from 2017 to 2021 but increased in 2022. The rate of decrease in VISorig is double that in VISnorm, which suggests that meteorology contributes to visibility degradation. The SHAP analysis identified relative humidity (RH), PM2.5, PM10, day of the season year (Julian day, JD), and O3 as the most important variables affecting visibility. At low RH, visibility is not sensitive to changes in RH.
However, beyond a threshold, RH and visibility are negatively correlated, potentially because of the hygroscopic growth of aerosols. The dependence of the Shapley values of PM2.5 and PM10 on RH, together with the change in average visibility across RH intervals, also suggests an effect of aerosol hygroscopic growth on visibility. Visibility is negatively related to both PM2.5 and PM10. Visibility is positively correlated with O3 at low to moderate concentrations, with a diminishing impact at very high concentrations. JD is strongly negatively related to visibility during winter and weakly positively related to it later, in summer. These findings suggest that ML techniques are feasible for predicting visibility and for understanding the factors that influence its fluctuations.
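
The four evaluation metrics quoted above can be illustrated with a minimal sketch. The abstract does not define them, so the code assumes that ρ is the Pearson correlation coefficient, MB is the mean bias (mean of predicted minus observed), ME is the mean absolute error, and RMSE is the root-mean-square error. The function name `evaluation_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def evaluation_metrics(observed, predicted):
    """Return (rho, mb, me, rmse) for paired visibility values in km.

    Assumed definitions (not stated in the abstract):
    rho = Pearson correlation, mb = mean bias (predicted - observed),
    me = mean absolute error, rmse = root-mean-square error.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    n = len(observed)

    # Signed errors drive MB; their magnitudes drive ME and RMSE.
    errors = [p - o for o, p in zip(observed, predicted)]
    mb = sum(errors) / n
    me = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Pearson correlation from centered sums of products.
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    rho = cov / math.sqrt(var_o * var_p)

    return rho, mb, me, rmse


# Made-up values for illustration only, not data from the study.
obs = [8.0, 6.0, 10.0, 4.0]
pred = [7.0, 7.0, 10.0, 4.0]
rho, mb, me, rmse = evaluation_metrics(obs, pred)
```

Under these assumed definitions, a model can have MB close to zero while ME and RMSE remain well above zero, because positive and negative errors cancel in the mean bias. This is one reason to report several metrics side by side.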