Abstract

PM2.5 is one of the main air pollutants, and high PM2.5 concentrations seriously threaten human health. Accurate prediction of PM2.5 concentration therefore has great practical significance for air quality monitoring, air pollution remediation, and human health. This paper takes the historical air quality concentration data and meteorological data of the Beijing Olympic Sports Center as its research object. It establishes a long short-term memory (LSTM) model with a time window size of 12, a T-shape light gradient boosting machine (TSLightGBM) model that uses all information in the time window as the input for predicting the next period, and an LSTM-TSLightGBM combined model based on an optimal weighted combination method, and uses them to predict the hourly PM2.5 concentration. The prediction results on the test set show that the mean absolute error (MAE), root mean squared error (RMSE), and symmetric mean absolute percentage error (SMAPE) of the LSTM-TSLightGBM model are 11.873, 22.516, and 19.540%, respectively. Compared with LSTM, TSLightGBM, the recurrent neural network (RNN), and other models, the LSTM-TSLightGBM model has lower MAE, RMSE, and SMAPE, and therefore higher prediction accuracy for PM2.5 and better goodness-of-fit.
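The three error measures named in the abstract can be sketched as follows. This is a minimal illustration, not the paper's code: the function names are hypothetical, and the SMAPE form used here (absolute error divided by the mean of the absolute actual and predicted values, in percent) is one common definition that is assumed rather than taken from the paper.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent.

    Assumed definition: mean of |y - yhat| / ((|y| + |yhat|) / 2) * 100.
    Points where both values are zero are skipped to avoid 0/0.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    mask = denom > 0
    if not mask.any():
        return 0.0
    return float(100.0 * np.mean(np.abs(y_true - y_pred)[mask] / denom[mask]))
```

Because MAE and RMSE are in the units of the concentration itself while SMAPE is a percentage, reporting all three gives both an absolute and a scale-free view of the error.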

Highlights

  • In recent years, the process of industrialization and urbanization has accelerated, injecting vitality into the global economy, bringing people a better life but causing serious damage to the ecological environment

  • Liu et al. [7] combined the support vector machine (SVM) with particle swarm optimization (PSO) to establish a rolling forecast model, which performed better than a single radial basis neural network and multiple linear regression

  • To predict hourly PM2.5 from historical variable information, PM2.5 concentration is used as the explained variable, and historical air particulate matter records and weather conditions are used as the explanatory variables
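The framing in the last highlight, with a time window of 12 as stated in the abstract, can be sketched as a sliding-window transformation. This is an illustration under stated assumptions: the function name `make_windows` and the array layout are hypothetical, and flattening the whole window into one row is only one plausible reading of "uses all information in the time window" for the tree-based model.

```python
import numpy as np


def make_windows(features, target, window=12):
    """Turn hourly series into supervised samples.

    features: array of shape (T, F), the explanatory variables per hour
              (assumed to include past pollutant and weather readings).
    target:   array of shape (T,), the hourly PM2.5 concentration.
    Returns:
      X_seq  (n, window, F): sequence input, the shape an LSTM expects
      X_flat (n, window*F):  the same window flattened into one row,
                             the shape a gradient-boosted tree model expects
      y      (n,):           PM2.5 in the hour right after each window
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(target) - window
    if n <= 0:
        raise ValueError("series is shorter than the window")
    # Window i covers hours i .. i+window-1 and predicts hour i+window.
    X_seq = np.stack([features[i:i + window] for i in range(n)])
    X_flat = X_seq.reshape(n, -1)
    y = target[window:]
    return X_seq, X_flat, y
```

Both models see the same 12 hours of history for each sample; only the layout of that history differs.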


Summary

Introduction

The process of industrialization and urbanization has accelerated, injecting vitality into the global economy and bringing people a better life, but also causing serious damage to the ecological environment. The correlation coefficient, Spearman rank correlation, and mean square error of the LSTM model are better than those of the RNN, which indicates that it is an air quality prediction model with higher accuracy and stronger generalization. Liu et al. [7] combined the support vector machine (SVM) with particle swarm optimization (PSO) to establish a rolling forecast model, which performed better than a single radial basis neural network and multiple linear regression. RNN and LSTM models with an added attention mechanism are more accurate in predicting the concentration of PM2.5 than those without it. Most of these neural network (NN) models give more attention to time series features. Taking into account both the time series characteristics and the nonlinear characteristics of the data, this article proposes combining an ensemble learning model with LSTM to establish a short-term prediction model of PM2.5. It establishes a weighted combination model of LSTM and LightGBM through the optimal weighted combination method, based on the residuals on the validation set.
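The weighting step described at the end of this paragraph can be sketched for two models. This is a hedged illustration, not the paper's implementation: it assumes the weights sum to 1 and are chosen to minimize the sum of squared combined residuals on the validation set, which is the usual form of an optimal weighted combination; the function names and the clipping to [0, 1] are assumptions.

```python
import numpy as np


def optimal_weights(y_val, pred_a, pred_b):
    """Weights (w_a, w_b), w_a + w_b = 1, minimizing the squared
    validation residual of w_a * pred_a + w_b * pred_b."""
    y_val = np.asarray(y_val, dtype=float)
    e_a = y_val - np.asarray(pred_a, dtype=float)  # residuals of model A
    e_b = y_val - np.asarray(pred_b, dtype=float)  # residuals of model B
    denom = np.sum((e_a - e_b) ** 2)
    if denom == 0.0:
        # Identical residuals: any split gives the same error.
        return 0.5, 0.5
    # Closed form from d/dw || e_b + w * (e_a - e_b) ||^2 = 0
    w_a = np.sum(e_b * (e_b - e_a)) / denom
    # Assumption: keep the weights non-negative by clipping to [0, 1].
    w_a = float(np.clip(w_a, 0.0, 1.0))
    return w_a, 1.0 - w_a


def combine(pred_a, pred_b, w_a, w_b):
    """Apply the fitted weights to new predictions, e.g. on the test set."""
    return w_a * np.asarray(pred_a, dtype=float) + w_b * np.asarray(pred_b, dtype=float)
```

In this sketch the weights are fitted once on validation residuals and then held fixed when combining the two models' test-set predictions, so the test set plays no part in choosing them.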

LSTM Model
LightGBM Model
GOSS Algorithm and EFB Algorithm
Histogram Algorithm
Building the Leaf-wise Strategy
An Optimal Weighted Combination Method
The Combined Forecasting Process
Experimental Environment
Data Source and Index Selection
Data Preprocessing
Data Set Partition and Normalization
Evaluation Indicators
Model Construction and Evaluation
The TSLightGBM Model
LSTM-TSLightGBM Weighted Combination Model
Findings
Conclusions