PM2.5 Concentration Prediction Based on Spatiotemporal Feature Selection Using XGBoost-MSCNN-GA-LSTM

Hongbin Dai, Huibin Zeng, Fan Yang, Guangqiu Huang

doi:10.3390/su132112071

Open Access

https://doi.org/10.3390/su132112071

Abstract

With the rapid development of China’s industrialization, air pollution is becoming increasingly serious. Predicting air quality is essential for identifying further preventive measures to avoid negative impacts. Existing approaches to predicting atmospheric pollutant concentrations ignore feature redundancy and spatio-temporal characteristics; as a result, their accuracy is low and their transferability is weak. Therefore, extreme gradient boosting (XGBoost) is first applied to select the features relevant to PM2.5. One-dimensional multi-scale convolution kernels (MSCNN) are then used to extract local temporal and spatial feature relations from the air quality data, and the outputs are linearly spliced and fused to obtain the spatio-temporal relationships among multiple features. Finally, XGBoost and MSCNN are combined with a long short-term memory (LSTM) network, which is well suited to time series, and a genetic algorithm (GA) is applied to optimize the LSTM's parameter set. The spatio-temporal relationships among multiple features are fed into the LSTM network, which outputs the long-term feature dependencies used to predict the PM2.5 concentration. The result is XGBoost-MSCGL, a PM2.5 concentration prediction model based on spatio-temporal feature selection. The data set consists of hourly concentration data for six atmospheric pollutants, together with meteorological data, from the Fen-Wei Plain in 2020. To verify the effectiveness of the model, XGBoost-MSCGL is compared with benchmark models such as the multilayer perceptron (MLP), CNN, LSTM, XGBoost, and CNN-LSTM, both before and after XGBoost feature selection. According to the forecast results for 12 cities, compared with the single models, the root mean square error (RMSE) decreased by about 39.07%, the average MAE decreased by about 42.18%, the average MAE [sic] decreased by about 49.33%, and R2 increased by 23.7%.
Compared with the models after feature selection, the RMSE decreased by about 15% on average, the MAPE by 16%, and the MAE by 21%, while R2 increased by 2.6%. The experimental results show that the XGBoost-MSCGL prediction model offers a more comprehensive understanding, reaches deeper levels, delivers higher prediction accuracy, and generalizes better in the prediction of PM2.5 concentration.
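The multi-scale convolution and linear splicing step described in the abstract can be illustrated with a minimal sketch. It assumes fixed averaging kernels of sizes 1, 3, and 5 with zero padding, whereas a trained MSCNN would learn its kernel weights. The kernel sizes, the function names `conv1d_same` and `multi_scale_features`, and the sample readings are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def conv1d_same(series: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Slide `kernel` over `series`, zero-padding so the output keeps the input length."""
    k = len(kernel)
    pad_left = (k - 1) // 2
    pad_right = k - 1 - pad_left
    padded = [0.0] * pad_left + list(series) + [0.0] * pad_right
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(series))
    ]


def multi_scale_features(
    series: Sequence[float], kernel_sizes: Sequence[int] = (1, 3, 5)
) -> List[List[float]]:
    """Run one branch per kernel size and splice the branch outputs per time step."""
    # Assumption: averaging kernels stand in for learned convolution weights.
    branches = [conv1d_same(series, [1.0 / k] * k) for k in kernel_sizes]
    # Linear splicing: concatenate the branch outputs feature-wise at each time step.
    return [list(step) for step in zip(*branches)]


# Made-up hourly PM2.5 readings, for illustration only.
pm25 = [10.0, 20.0, 30.0, 40.0, 50.0]
features = multi_scale_features(pm25)
```

Each row of `features` holds one value per scale for a single time step. A matrix of this shape is the kind of fused multi-scale input that a sequence model such as an LSTM could consume.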

Highlights

The multilayer perceptron (MLP) model is similar to the long short-term memory (LSTM) model in that the predicted values deviate greatly from the measured values when the measured values increase or decrease sharply
Comparing XGBoost-MLP, XGBoost-LSTM, XGBoost-convolutional neural network (CNN), and XGBoost-MSCGL with CNN, LSTM, MLP, and CNN-LSTM, we found that the predicted values of the models after feature selection are closer to the measured values than those before feature selection, with a greater increase in accuracy and a marked decrease in deviation
Four deep learning combination models were trained and validated for prediction accuracy; the results show that XGBoost-MSCGL has the highest prediction accuracy on the training sets of most cities, and its prediction performance is better than that of the other models
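The model comparisons above rest on the four metrics named in the abstract: RMSE, MAE, MAPE, and R2. The sketch below uses their standard textbook definitions and computes a relative decrease between two models. The exact formula the authors used for the percentage decreases is not stated here, so `relative_decrease` is an assumption, and all sample values are invented for illustration.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; assumes no measured value is zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination; assumes the measured values are not all equal."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_decrease(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline` (assumed definition)."""
    return 100.0 * (baseline - improved) / baseline


# Invented measured values and two hypothetical model outputs.
measured = [50.0, 60.0, 70.0, 80.0]
single_model = [40.0, 70.0, 60.0, 90.0]
combined_model = [45.0, 65.0, 65.0, 85.0]

rmse_drop = relative_decrease(rmse(measured, single_model), rmse(measured, combined_model))
mae_drop = relative_decrease(mae(measured, single_model), mae(measured, combined_model))
```

Lower RMSE, MAE, and MAPE and a higher R2 all indicate predictions closer to the measured values, which is why a percentage decrease in the first three can appear alongside an increase in the last.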

Summary

Introduction

As environmental pollution increases, haze is spreading across China’s major cities, and PM2.5 has become a major air pollution problem. Recent studies have shown that PM2.5 contributes to respiratory diseases, immune diseases, cardiovascular and cerebrovascular diseases, and tumors [1,2]. Accurate prediction and early warning of PM2.5 concentrations are therefore of great significance. Many scholars have begun to integrate multiple data features, but too many data and factor features degrade the prediction, and redundant features impair model performance. Many scholars have therefore begun to use feature selection when making predictions. For example, in power systems, a cooperative search algorithm is used to select

Journal: Sustainability
Publication Date: Nov 1, 2021
Citations: 19
License type: CC BY 4.0
