Abstract

This study used air quality and meteorological data as research materials and extracted the optimal feature subset with the approximate Markov blanket-based normal maximum relevance minimum redundancy (nMRMR) algorithm; this subset served as the input to the prediction model. In addition, a hybrid kernel (HK) was created to improve upon the traditional support vector regression (SVR) model. Particle swarm optimization (PSO) was used to find the optimal parameters of the HK-SVR, which were then used to establish the nMRMR-PSO-HK-SVR model for PM2.5 concentration prediction. Air quality and weather data for Wuhan and Tianjin from 2016 to 2019 were used to test the proposed method. The experimental results show that the mean absolute error (MAE), mean absolute percentage error (MAPE), root mean square error (RMSE) and Theil's inequality coefficient (TIC) of the nMRMR-PSO-HK-SVR model are lower than those of the SVR, PSO-SVR, nMRMR-SVR and PSO-HK-SVR models. The proposed model also tracked moments of sudden PM2.5 concentration change more precisely. Thus, the nMRMR-PSO-HK-SVR model generalizes better and predicts PM2.5 concentration more precisely.
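The four error measures named in the abstract can be sketched as follows. This is a minimal illustration using the textbook definitions, not the authors' code. The function names and sample values are hypothetical, MAPE is expressed in percent, and TIC is assumed to be the common form of Theil's coefficient that is bounded between 0 and 1; the paper's exact formulation is not given in this summary.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes no zero in actual)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    total = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(total / len(actual))


def tic(actual, predicted):
    """Theil's inequality coefficient (assumed form: RMSE divided by the
    sum of the root mean squares of the two series; 0 means a perfect fit)."""
    n = len(actual)
    rms_actual = math.sqrt(sum(a * a for a in actual) / n)
    rms_predicted = math.sqrt(sum(p * p for p in predicted) / n)
    return rmse(actual, predicted) / (rms_actual + rms_predicted)


# Hypothetical PM2.5 readings and forecasts, for illustration only.
observed = [10.0, 20.0, 30.0]
forecast = [12.0, 18.0, 33.0]
print(mae(observed, forecast), mape(observed, forecast),
      rmse(observed, forecast), tic(observed, forecast))
```

Lower values indicate a better fit for all four measures, which is the sense in which the abstract compares the models.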

Highlights

  • Rapid development of economies worldwide has caused increasingly severe air pollution

  • Kim et al (2010) used the partial least squares (PLS) method to select the variables with a greater impact on the output when predicting PM2.5 and PM10 in a subway station; a comparison with predictions that took all measured variables as inputs demonstrated the necessity of selecting characteristic variables

  • The optimal feature subset was used as the input, and the hybrid kernel support vector regression model was used to predict the PM2.5 concentration over 24 h

Summary

Introduction

Rapid development of economies worldwide has caused increasingly severe air pollution. As a major type of air pollutant, PM2.5 has complex origins and forms through a complicated process under the influence of numerous factors (Ni et al, 2017; Song et al, 2018). It exhibits high complexity and nonlinearity (Wang et al, 2017b). Sun and Sun (2017) combined principal component analysis (PCA) with least-squares SVR to predict the daily PM2.5 concentration; their experiments showed high prediction precision. SVR models are based on statistical learning theory (Zhang, 2019) and adopt the principle of structural risk minimization, which guards against overfitting. Singh and Gupta (2012) used stepwise linear regression to select the original features when predicting urban air quality, and compared linear and nonlinear prediction models experimentally. Kim et al (2010) used the partial least squares (PLS) method to select the variables with a greater impact on the output when predicting PM2.5 and PM10 in a subway station; a comparison with predictions that took all measured variables as inputs demonstrated the necessity of selecting characteristic variables.
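The hybrid kernel mentioned in the abstract is not specified in this summary. As a rough sketch of the general idea, the example below assumes one common construction: a convex combination of a local RBF kernel and a global polynomial kernel. The function names, the `weight` parameter, and all default values are hypothetical and are not taken from the paper.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel: a local kernel that decays with distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def poly_kernel(x, z, degree=2, coef0=1.0):
    """Polynomial kernel: a global kernel based on the dot product."""
    dot = sum(a * b for a, b in zip(x, z))
    return (dot + coef0) ** degree


def hybrid_kernel(x, z, weight=0.7, gamma=0.5, degree=2, coef0=1.0):
    """Assumed hybrid form: weight * RBF + (1 - weight) * polynomial.
    A convex combination of two valid kernels is itself a valid kernel."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    local_part = rbf_kernel(x, z, gamma)
    global_part = poly_kernel(x, z, degree, coef0)
    return weight * local_part + (1.0 - weight) * global_part


def gram_matrix(samples, **kernel_params):
    """Pairwise hybrid-kernel values for a list of feature vectors."""
    return [[hybrid_kernel(a, b, **kernel_params) for b in samples]
            for a in samples]


# Hypothetical two-feature samples, for illustration only.
samples = [[1.0, 2.0], [0.5, 1.5], [2.0, 0.0]]
gram = gram_matrix(samples, weight=0.7)
```

A Gram matrix of this kind could, for instance, be passed to an SVR implementation that accepts precomputed kernels. Under this assumed form, the mixing weight and the kernel parameters are the sort of values an optimizer such as PSO would search over, although which parameters the authors actually tune is not stated here.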

