A new hybrid model and its application for forecasting of daily PM2.5 concentrations
- Research Article
- 10.3390/atmos16121317
- Nov 22, 2025
- Atmosphere
Numerous machine learning models have been widely used for the spatial prediction of PM2.5 mass concentrations in the field of remote sensing, but most studies rely on single models, limiting their ability to capture complex nonlinear relationships. Furthermore, traditional Aerosol Optical Depth (AOD) methods suffer from extensive missing values due to algorithmic limitations, hindering daily PM2.5 mass concentration retrieval. This study first developed a hybrid random forest and extreme gradient boosting model (RF-XGBoost) to overcome single-model accuracy constraints. Subsequently, Top-of-Atmosphere (TOA) reflectance replaced conventional AOD as the hybrid model’s input. Finally, we integrated four-year (2020–2023) TOA reflectance, normalized difference vegetation index (NDVI) data, meteorological data, digital elevation model (DEM) data, and day-of-year data to develop a high-precision hybrid model specifically optimized for Jiangxi Province. The simulation results demonstrated that the hybrid RF-XGBoost model (test-R2 = 0.82, RMSE = 7.25 μg/m3, MAE = 4.90 μg/m3) outperformed the single Random Forest Model by 25% and 26% in terms of the root mean square error (RMSE) and mean absolute error (MAE), respectively. The high predictive accuracy of our method confirms its effectiveness in generating reliable PM2.5 estimates. The resulting four-year dataset also successfully delineated the characteristic seasonal PM2.5 pattern in the region, with the highest levels in winter and the lowest in summer, alongside a clear decreasing annual trend, signifying gradual atmospheric improvement.
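The scores quoted in this abstract (R2, RMSE, MAE) and the percentage improvement over a baseline follow standard definitions. The sketch below is a minimal illustration in plain Python; the function names `regression_scores` and `relative_improvement` are hypothetical and do not come from the study, and the study's own evaluation code is not shown in the source.

```python
import math

def regression_scores(observed, predicted):
    """Return (r2, rmse, mae) for paired observed/predicted values."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observations are equal")
    r2 = 1.0 - ss_res / ss_tot
    return r2, rmse, mae

def relative_improvement(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    return 100.0 * (baseline_error - new_error) / baseline_error
```

With these definitions, a 25% RMSE improvement means the hybrid model's RMSE is three quarters of the baseline's RMSE.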
- Research Article
4
- 10.1007/s10661-024-13005-2
- Aug 29, 2024
- Environmental monitoring and assessment
Air pollution, particularly PM2.5, has long been a critical concern for the atmospheric environment. Accurately predicting daily PM2.5 concentrations is crucial for both environmental protection and public health. This study introduces a new hybrid model within the "Decomposition-Prediction-Integration" (DPI) framework, which combines variational modal decomposition (VMD), causal convolutional neural network (CNN), bidirectional long short-term memory (BiLSTM), and attention mechanism (AM), named as VCBA, for spatio-temporal fusion of multi-site data to forecast daily PM2.5 concentrations in a city. The approach involves integrating air quality data from the target site with data from neighboring sites, applying mathematical techniques for dimensionality reduction, decomposing PM2.5 concentration data using VMD, and utilizing Causal CNN and BiLSTM models with an attention mechanism to enhance performance. The final prediction results are obtained through linear aggregation. Experimental results demonstrate that the VCBA model performs exceptionally well in predicting daily PM2.5 concentrations at various stations in Taiyuan City, Shanxi Province, China. Evaluation metrics such as RMSE, MAE, and R2 are reported as 2.556, 1.998, and 0.973, respectively. Compared to traditional methods, this approach offers higher prediction accuracy and stronger spatio-temporal modeling capabilities, providing an effective solution for accurate PM2.5 daily concentration prediction.
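The "Decomposition-Prediction-Integration" flow described above can be sketched in a few lines. This is only a structural illustration under loud assumptions: a trailing moving average stands in for VMD, a mean-of-recent-values rule stands in for the Causal CNN, BiLSTM and attention predictor, and all function names are hypothetical. Only the final step, summing the component forecasts, mirrors the linear aggregation named in the abstract.

```python
def moving_average(series, window):
    """Trailing moving average; early points use the values available so far."""
    out = []
    for i in range(len(series)):
        start = max(0, i - window + 1)
        chunk = series[start:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def decompose(series, window=3):
    """Split a series into a smooth part and a residual (stand-in for VMD)."""
    smooth = moving_average(series, window)
    residual = [x - s for x, s in zip(series, smooth)]
    return [smooth, residual]

def predict_component(component, lookback=3):
    """Placeholder one-step forecaster: mean of the last `lookback` values."""
    tail = component[-lookback:]
    return sum(tail) / len(tail)

def dpi_forecast(series, window=3, lookback=3):
    """Decompose, forecast each component, then integrate by summation."""
    components = decompose(series, window)
    return sum(predict_component(c, lookback) for c in components)
```

The point of the design is that each component is smoother and easier to model than the raw series, and the components add back up to the original signal.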
- Research Article
131
- 10.1016/j.asoc.2020.106620
- Aug 6, 2020
- Applied Soft Computing
A novel hybrid model based on multi-objective Harris hawks optimization algorithm for daily PM2.5 and PM10 forecasting
- Research Article
3
- 10.1007/s44273-024-00048-7
- Dec 20, 2024
- Asian Journal of Atmospheric Environment
Accurate air pollution predictions in urban areas facilitate the implementation of efficient actions to control air pollution and the formulation of strategies to mitigate contamination. This includes establishing an early warning system to notify the public. Creating precise estimates for PM2.5 air pollutants in large cities is a challenging task because of the numerous relevant factors and quick fluctuations. This study introduces a novel hybrid model named STL-CNN-BILSTM-AM. It combines the seasonal-trend decomposition method with LOESS (STL) to simplify learning tasks and increase prediction accuracy for complex, nonlinear time-series data. Convolutional neural networks (CNNs) extract features from decomposed components of PM2.5 and other feature variables, such as pollutants and meteorological variables. Bidirectional long-short-term memory (BILSTM) uses these features to extract temporal relationships, enabling the forecasting of daily PM2.5 levels at four locations in Delhi. This hybrid model uses attention mechanisms to extract the most significant information, as well as Bayesian optimization to tune the hyperparameters. The suggested model greatly improved performance in all four regions used in this study, as evidenced by the findings. We compared it with the CNN-BILSTM, BILSTM, LSTM, and CNN models, and the suggested model outperformed the state-of-the-art models by utilizing STL decomposition components and other features. The overall results show that the STL-CNN-BILSTM-AM is better at predicting air quality, especially the concentration of PM2.5 in cities when the data has a high seasonal trend and is complex.
- Research Article
28
- 10.3390/toxics13040254
- Mar 28, 2025
- Toxics
Surface air pollution affects ecosystems and people’s health. However, traditional models have low prediction accuracy. Therefore, a hybrid model for accurately predicting daily surface PM2.5 concentrations was built by integrating wavelet (W), convolutional neural network (CNN), bidirectional long short-term memory (BiLSTM), and bidirectional gated recurrent unit (BiGRU) components. Data on meteorological factors and air pollutants in Guangzhou City from 2014 to 2020 were used as inputs to the models. The W-CNN-BiGRU-BiLSTM hybrid model demonstrated strong performance during the prediction phase, achieving an R (correlation coefficient) of 0.9952, a root mean square error (RMSE) of 1.4935 μg/m3, a mean absolute error (MAE) of 1.2091 μg/m3, and a mean absolute percentage error (MAPE) of 7.3782%. Accordingly, the accurate prediction of surface PM2.5 concentrations is beneficial for air pollution control and urban planning.
- Research Article
17
- 10.1016/j.chemosphere.2022.136252
- Aug 30, 2022
- Chemosphere
Extraction of multi-scale features enhances the deep learning-based daily PM2.5 forecasting in cities
- Research Article
4
- 10.1089/big.2022.0082
- May 3, 2023
- Big data
With the acceleration of urbanization, air pollution, especially PM2.5, has seriously affected human health and reduced people's quality of life. Accurate PM2.5 prediction is important for environmental protection authorities to take action and develop prevention countermeasures. In this article, an adapted Kalman filter (KF) approach is presented to remove the nonlinearity and stochastic uncertainty of time series, from which the autoregressive integrated moving average (ARIMA) model suffers. To further improve the accuracy of PM2.5 forecasting, a hybrid model is proposed by introducing an autoregressive (AR) model, where the AR part is used to determine the state-space equation, whereas the KF part is used for state estimation on PM2.5 concentration series. A modified artificial neural network (ANN), called AR-ANN, is introduced for comparison with the AR-KF model. According to the results, the AR-KF model outperforms the AR-ANN model and the original ARIMA model in prediction accuracy; that is, the AR-ANN obtains 10.85 and 15.45 of mean absolute error and root mean square error, respectively, whereas the ARIMA gains 30.58 and 29.39 on the corresponding metrics. It, therefore, proves that the presented AR-KF model can be adopted for air pollutant concentration prediction.
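The division of labour described above, with an AR model supplying the state equation and a Kalman filter doing the state estimation, can be illustrated with a scalar sketch. It rests on assumptions not stated in the abstract: an AR(1) model without intercept, a direct observation of the state, and arbitrary noise variances `q` and `r`. The function names are hypothetical, and this is not the paper's adapted filter.

```python
def fit_ar1(series):
    """Least-squares AR(1) coefficient (no intercept) for x_t = phi * x_{t-1}."""
    num = sum(a * b for a, b in zip(series[1:], series[:-1]))
    den = sum(b * b for b in series[:-1])
    if den == 0:
        raise ValueError("cannot fit AR(1) to an all-zero history")
    return num / den

def kalman_ar1(observations, phi, q=1.0, r=4.0):
    """Scalar Kalman filter with an AR(1) state transition.

    State model:       x_t = phi * x_{t-1} + w_t,  Var(w_t) = q  (assumed)
    Observation model: z_t = x_t + v_t,            Var(v_t) = r  (assumed)
    Returns the filtered state estimates and the one-step-ahead forecast.
    """
    x, p = observations[0], r  # initialise at the first observation
    filtered = [x]
    for z in observations[1:]:
        # Predict step: propagate the state and its variance through the AR model.
        x_pred = phi * x
        p_pred = phi * phi * p + q
        # Update step: blend the prediction with the new observation.
        gain = p_pred / (p_pred + r)
        x = x_pred + gain * (z - x_pred)
        p = (1.0 - gain) * p_pred
        filtered.append(x)
    return filtered, phi * x
```

A larger `r` relative to `q` makes the filter trust the AR prediction more than each new measurement, which is how the filter smooths noisy concentration readings.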
- Research Article
244
- 10.1016/j.atmosenv.2016.03.056
- Apr 1, 2016
- Atmospheric Environment
A novel hybrid decomposition-and-ensemble model based on CEEMD and GWO for short-term PM2.5 concentration forecasting
- Conference Article
3
- 10.23919/chicc.2019.8866134
- Jul 1, 2019
At present, there are serious air pollution problems in most cities in China. As one of the main atmospheric pollutants, PM2.5 has caused serious harm to people’s health. In order to improve the accuracy of PM2.5 concentration prediction, this paper proposes a new hybrid model based on complementary ensemble empirical mode decomposition (CEEMD) and Long Short-Term Memory (LSTM) to predict daily PM2.5 concentration. The daily PM2.5 concentration and meteorological data from January 2010 to December 2014 released by the US Embassy are selected as experimental data. Compared with extreme learning machine (ELM), Support Vector Regression (SVR) and Long Short-Term Memory (LSTM), the CEEMD-LSTM model shows a higher prediction ability.
- Research Article
20
- 10.1007/s13762-018-1999-x
- Sep 24, 2018
- International Journal of Environmental Science and Technology
Prediction of air pollutants, in particular those related to PM10, has attracted great interest in recent years, mainly due to their impact on the environment and humans. A large number of factors influence air pollutant prediction. The researcher has to select the most relevant ones by trying different combinations of input variables in order to find the combination that gives the best prediction by an artificial neural network (ANN). In this work, applications of principal component analysis (PCA) are presented to solve the problem of variable selection in the prediction of daily PM10. This method is tested using time series data of solar radiation, vertical wind speed, atmospheric pressure, PM2.5, benzene, NO and PM10 for Varanasi, India. The results show that PCA-ANN predicts daily PM10 with a mean absolute percentage error (MAPE) of 9.88% and that it predicts better than multiple linear regression models.
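The PCA step can be illustrated as follows: correlated input variables are projected onto their leading principal component, and the resulting scores can replace the raw variables as network inputs. The sketch is a minimal one in plain Python that finds only the first component by power iteration; it assumes unstandardized inputs, uses hypothetical function names, and is not the paper's procedure.

```python
import math

def first_principal_component(rows, iterations=200):
    """Leading eigenvector of the sample covariance matrix via power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    vec = [1.0] * d  # arbitrary start; must not be orthogonal to the answer
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0:
            raise ValueError("degenerate covariance or unlucky start vector")
        vec = [v / norm for v in nxt]
    return vec, centered

def project(centered, vec):
    """Scores of each centered row on the given component."""
    return [sum(c[j] * vec[j] for j in range(len(vec))) for c in centered]
```

In practice the variables would usually be standardized first, since solar radiation, pressure and pollutant concentrations are measured on very different scales.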
- Research Article
16
- 10.3390/atmos13122124
- Dec 17, 2022
- Atmosphere
A CNN+LSTM (Convolutional Neural Network + Long Short-Term Memory) based deep hybrid neural network was established for citywide daily PM2.5 prediction in South Korea. The structural hyperparameters of the CNN+LSTM model were determined through comprehensive sensitivity tests. The input features were obtained from ground observations and the GFS forecast. The performance of CNN+LSTM was evaluated by comparison with PM2.5 observations and with the 3-D CTM (three-dimensional chemistry transport model)-predicted PM2.5. The newly developed hybrid model estimated ambient levels of PM2.5 more accurately than the 3-D CTM. For example, the error and bias of the CNN+LSTM prediction were 1.51 and 6.46 times smaller than those of the 3-D CTM simulation. In addition, based on IOA (Index of Agreement), the accuracy of the CNN+LSTM prediction was 1.10–1.18 times higher than that of the 3-D CTM-based prediction. The importance of input features was indirectly investigated by sequentially perturbing input variables. The most important meteorological and atmospheric environmental features were geopotential height and previous-day PM2.5. The obstacles facing the current CNN+LSTM-based PM2.5 prediction were also discussed. The promising result of this study indicates that DNN-based models can be utilized as an effective tool for air quality prediction.
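Investigating feature importance by perturbing one input at a time can be sketched as below. The abstract does not say how the inputs were perturbed, so replacing a feature with its column mean is an assumption made here for a deterministic example; `model` is any callable that maps a feature row to a prediction, and all names are hypothetical.

```python
import math

def rmse(model, rows, targets):
    """Root mean square error of model predictions over the given rows."""
    errs = [(model(row) - t) ** 2 for row, t in zip(rows, targets)]
    return math.sqrt(sum(errs) / len(errs))

def perturbation_importance(model, rows, targets):
    """RMSE increase when each feature in turn is replaced by its column mean."""
    baseline = rmse(model, rows, targets)
    scores = []
    for j in range(len(rows[0])):
        col_mean = sum(row[j] for row in rows) / len(rows)
        # Flatten feature j so the model can no longer use its variation.
        perturbed = [row[:j] + [col_mean] + row[j + 1:] for row in rows]
        scores.append(rmse(model, perturbed, targets) - baseline)
    return scores
```

A feature whose perturbation raises the error the most is the one the trained model leans on most heavily; a score near zero suggests the model largely ignores that input.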
- Research Article
- 10.1289/isee.2020.virtual.o-os-486
- Oct 26, 2020
- ISEE Conference Abstracts
Background/Aim: Fine particles (PM2.5) are associated with a higher risk for coronary events. Cardiac troponin T (cTnT) is a myocardium-specific protein which is measured clinically for the diagnosis and prognosis of myocardial infarction (MI). An elevation in circulating cTnT also occurs in non-ischemic conditions and indicates myocardial damage. We aimed to investigate short-term PM2.5 effects on cTnT and other myocardial injury-related biomarkers among participants undergoing cardiac catheterizations.
Methods: This study included 7,497 plasma cTnT measurements conducted in 2,739 participants presenting to Duke University Hospital (2000 to 2012), partly alongside with measurements of C-reactive protein, fibrinogen, white blood cells, N-terminal-pro brain natriuretic peptide (NT-pro BNP), and partial oxygen pressure (PaO2). Daily PM2.5 was predicted by a neural network-based hybrid model at a 1 km resolution and was assigned to participants' residential addresses. We applied generalized estimating equations to assess associations of PM2.5 with biomarker levels and the risk of a positive cTnT test (cTnT > 0.1 ng/mL).
Results: Mean PM2.5 concentration was 11.8 μg/m3. Median plasma cTnT was 0.05 ng/mL and the prevalence of a positive cTnT test was 35.6% at presentation. For a 10 μg/m3 increase in PM2.5 one day before cTnT measurement, plasma cTnT increased by 11.1% (95% CI: 5.3–17.4) and the odds ratio of a positive cTnT test was 1.12 (95% CI: 1.03–1.23). Participants under 60 years [20.9% (95% CI: 10.2–32.6)] or living in rural areas [17.6% (95% CI: 7.3–28.7)] had stronger associations. There was additionally evidence for positive associations of PM2.5 with fibrinogen and NT-pro BNP within one day after exposure, as well as negative associations with PaO2 at lag 3-4 days.
Conclusions: Our study suggests that acute PM2.5 exposure may elevate indicators of myocardial injury and exertion, which substantiates the association of air pollution exposure with adverse cardiovascular events. This abstract does not necessarily represent EPA policy.
- Research Article
209
- 10.1016/j.jenvman.2016.12.011
- Dec 15, 2016
- Journal of Environmental Management
Daily PM2.5 concentration prediction based on principal component analysis and LSSVM optimized by cuckoo search algorithm
- Research Article
19
- 10.1016/j.envpol.2021.116663
- Feb 5, 2021
- Environmental Pollution
Association between short-term exposure to ambient fine particulate matter and myocardial injury in the CATHGEN cohort.
- Research Article
33
- 10.1016/j.envres.2021.111342
- May 18, 2021
- Environmental Research
Placental gene networks at the interface between maternal PM2.5 exposure early in gestation and reduced infant birthweight