Temporal heterogeneity in the performance of machine learning models for PM2.5 concentration estimation

Peizheng Li, Jing Wang, Jianpeng Liao, Can Yang, Xiangying Li, Qingyu Zhang, Haomin Yang, Shiqi Huang, Lu Ma, Chenxi Luo, Qihao Chen

doi:10.1016/j.psep.2024.06.115

Abstract

Machine learning (ML) methods have been applied extensively to simulate air pollutant concentrations and to assess individual exposure in epidemiological studies. However, research on the temporal heterogeneity of ML model performance and on the impact of dataset size remains scarce. To explore the temporal heterogeneity in model performance when estimating daily concentrations of fine particulate matter (PM2.5) across China in 2021, we compared five decision tree-based ML models (Random Forest (RF), Categorical Boosting (CatBoost), Gradient Boost Regression Tree (GBRT), eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM)) at the daily scale within three distinct timeframes. The performance of all models was evaluated using cross-validation. We observed that the performance of the ML models varied over time and was significantly correlated with PM2.5 concentration. Across the 365 days of 2021, the RF model performed best, with an annual mean R2 of 0.86, a minimum of 0.84, and a maximum of 0.95. For RF, we fitted a cubic polynomial curve to the relationship between model performance and PM2.5 concentration. Based on this fit, we devised a model selection strategy for different time scales that achieved an accuracy of up to 79.45%, with the selected models having an average R2 of 0.85 and a maximum of 0.95. Additionally, we found that increasing the dataset size did not significantly improve model performance; instead, it considerably lengthened runtime and increased memory usage. The methodology and findings of this study are valuable for developing more efficient and precise modeling approaches for air pollutant concentrations. Furthermore, this research provides a foundation for regional air pollution governance and future health-related research.
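
The cubic polynomial fit mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`solve_linear`, `fit_cubic`, `true_curve`) are hypothetical, the data are synthetic, and the fit uses plain least squares via the normal equations so that it runs with the standard library only. In practice, `numpy.polyfit` with degree 3 would serve the same purpose.

```python
from typing import Callable, List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_cubic(xs: Sequence[float], ys: Sequence[float]) -> Callable[[float], float]:
    """Least-squares cubic fit; returns a function mapping x to the fitted y."""
    if len(xs) != len(ys) or len(xs) < 4:
        raise ValueError("need at least 4 paired observations")
    # Center and scale x to [-1, 1] to keep the normal equations well conditioned.
    center = sum(xs) / len(xs)
    scale = max(abs(x - center) for x in xs) or 1.0
    zs = [(x - center) / scale for x in xs]
    # Normal equations (V^T V) c = V^T y, where V[i][k] = z_i ** k for k = 0..3.
    vtv = [[sum(z ** (j + k) for z in zs) for k in range(4)] for j in range(4)]
    vty = [sum((z ** j) * y for z, y in zip(zs, ys)) for j in range(4)]
    c = solve_linear(vtv, vty)

    def predict(x: float) -> float:
        z = (x - center) / scale
        # Horner evaluation of c0 + c1*z + c2*z^2 + c3*z^3.
        return c[0] + z * (c[1] + z * (c[2] + z * c[3]))

    return predict


def true_curve(x: float) -> float:
    """Made-up cubic used only to generate synthetic data (not from the study)."""
    return 0.70 + 4e-3 * x - 2e-5 * x ** 2 + 3e-8 * x ** 3


# Synthetic daily mean PM2.5 values and matching synthetic R2 scores.
pm25 = [10.0, 20.0, 35.0, 50.0, 75.0, 100.0, 150.0, 200.0]
daily_r2 = [true_curve(x) for x in pm25]

curve = fit_cubic(pm25, daily_r2)
print(round(curve(60.0), 4))  # fitted R2 at a PM2.5 value of 60
```

A fitted curve of this kind gives an expected R2 for a given PM2.5 level. How the study turns that curve into a model selection rule is not described in the abstract, so that step is left out here.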


Journal: Process Safety and Environmental Protection	Publication Date: Jun 26, 2024
Citations: 1	License type: cc-by-nc-nd


Similar Papers

Analysis of the Performance of Machine Learning Models in Predicting the Severity Level of Large-Truck Crashes
Jinli Liu ... Tao Tao
Future Transportation | VOL. 2
16 Nov 2022

Developing Tree-Based Machine Learning Models for Estimating the Pile Setup Parameter for Clay Soils
Mohammad Moontakim Shoaib ... Murad Y Abu-Farsakh
Transportation Research Record: Journal of the Transportation Research Board | VOL. -
21 Mar 2024

Explainable machine learning models for predicting the axial compression capacity of concrete filled steel tubular columns
Celal Cakiroglu ... Sujith Mangalathu
Construction and Building Materials | VOL. 356
02 Oct 2022

Advanced tree-based machine learning methods for predicting the seismic response of regular and irregular RC frames
Ahmet Demir ... Selçuk Demir
Structures | VOL. 64
11 May 2024
