Predicting Fine Particulate Matter (PM2.5) in the Greater London Area: An Ensemble Approach using Machine Learning Methods

Mahdieh Danesh Yazdi, Benjamin Barratt, Joel Schwartz, Alexei Lyapustin, Zheng Kuang, Konstantina Dimakopoulou, Klea Katsouyanni, Esra Suel, Heresh Amini

doi:10.3390/rs12060914

Abstract

Estimating air pollution exposure has long been a challenge for environmental health researchers. Technological advances and novel machine learning methods have allowed us to increase the geographic range and accuracy of exposure models, making them a valuable tool in conducting health studies and identifying hotspots of pollution. Here, we have created a prediction model for daily PM2.5 levels in the Greater London area from 1st January 2005 to 31st December 2013 using an ensemble machine learning approach incorporating satellite aerosol optical depth (AOD), land use, and meteorological data. The predictions were made on a 1 km × 1 km scale over 3960 grid cells. The ensemble included predictions from three different machine learners: a random forest (RF), a gradient boosting machine (GBM), and a k-nearest neighbor (KNN) approach. Our ensemble model performed very well, with a ten-fold cross-validated R² of 0.828. Of the three machine learners, the random forest outperformed the GBM and KNN. Our model was particularly adept at predicting day-to-day changes in PM2.5 levels, with an out-of-sample temporal R² of 0.882. However, its ability to predict spatial variability was weaker, with an R² of 0.396. We believe this to be due to the smaller spatial variation in pollutant levels in this area.
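The abstract reports separate temporal and spatial R² values but does not define them here. The sketch below shows one convention used in the exposure modeling literature, taken as an assumption rather than as the authors' exact procedure: the spatial R² compares each monitor's mean observed value with its mean predicted value, and the temporal R² compares day-to-day deviations from those monitor means. The function names and the toy records are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def squared_correlation(xs, ys):
    """R2 of a simple linear regression of ys on xs (squared Pearson r)."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def spatial_temporal_r2(records):
    """records: iterable of (monitor_id, observed, predicted) tuples."""
    by_monitor = defaultdict(list)
    for monitor_id, observed, predicted in records:
        by_monitor[monitor_id].append((observed, predicted))

    obs_means, pred_means = [], []  # one entry per monitor (spatial part)
    obs_devs, pred_devs = [], []    # one entry per record (temporal part)
    for pairs in by_monitor.values():
        obs_mean = fmean([o for o, _ in pairs])
        pred_mean = fmean([p for _, p in pairs])
        obs_means.append(obs_mean)
        pred_means.append(pred_mean)
        for o, p in pairs:
            obs_devs.append(o - obs_mean)
            pred_devs.append(p - pred_mean)

    spatial = squared_correlation(pred_means, obs_means)
    temporal = squared_correlation(pred_devs, obs_devs)
    return spatial, temporal


# Made-up values for three hypothetical monitors over four days. The
# predictions track every day-to-day change exactly but rank two of the
# monitor means in the wrong order.
toy_records = [
    ("A", 10, 11), ("A", 12, 13), ("A", 14, 15), ("A", 16, 17),
    ("B", 11, 10), ("B", 13, 12), ("B", 15, 14), ("B", 17, 16),
    ("C", 12, 12), ("C", 14, 14), ("C", 16, 16), ("C", 18, 18),
]
spatial, temporal = spatial_temporal_r2(toy_records)
print(f"spatial R2 = {spatial:.2f}, temporal R2 = {temporal:.2f}")
# prints "spatial R2 = 0.25, temporal R2 = 1.00"
```

In this toy case the temporal score is perfect while the spatial score is low, which mirrors the qualitative pattern in the abstract: a model can follow daily changes closely and still separate locations poorly when the between-site differences are small.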

Highlights

Environmental research has long struggled with exposure assessment in studies involving air pollutants.
We incorporate aerosol optical depth (AOD), land-use data, and meteorological data to predict PM2.5 levels on a 1 km × 1 km scale in the Greater London area from 1st January 2005 to 31st December 2013, using an ensemble model and four machine learning algorithms that were calibrated using data derived from a wide network of monitors.
Elevation data were obtained from the CGIAR Consortium for Spatial Information, which used Shuttle Radar Topography Mission (SRTM) data from the United States Geological Survey (USGS) and NASA.

Summary

Introduction

Environmental research has long struggled with exposure assessment in studies involving air pollutants. Novel machine learning techniques allow us to create models with greater accuracy and flexibility that can combine remote sensing, land use, meteorological, and chemical transport model (CTM) inputs. They are better at incorporating temporal variation than standard land use regression (LUR) models. Machine learning algorithms allow us to non-parametrically examine the relationship between the predictors of pollutant concentrations and measured pollutant concentrations [28,31,32,33,34,35,36]. We incorporate AOD, land-use data, and meteorological data to predict PM2.5 levels on a 1 km × 1 km scale in the Greater London area from 1st January 2005 to 31st December 2013, using an ensemble model and four machine learning algorithms that were calibrated using data derived from a wide network of monitors.
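As a rough illustration of the overall workflow, the following sketch fits the three learners named in the abstract (RF, GBM and KNN) with scikit-learn and scores their combined prediction by ten-fold cross-validation. Several things here are assumptions and not taken from the paper: the data are synthetic stand-ins for AOD, land-use and meteorological predictors, the hyperparameters are library defaults or arbitrary choices, and the learners are combined by unweighted averaging (`VotingRegressor`) because this summary does not describe how the authors blended them.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.metrics import r2_score
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (hypothetical): three columns loosely playing the
# roles of AOD, a land-use covariate, and a meteorological covariate.
rng = np.random.default_rng(0)
n_obs = 400
X = rng.normal(size=(n_obs, 3))
noise = rng.normal(scale=0.5, size=n_obs)
y = 10.0 + 3.0 * X[:, 0] + 2.0 * np.sin(X[:, 1]) + X[:, 0] * X[:, 2] + noise

# Unweighted average of the three learners (an assumed combination rule).
# KNN is distance based, so its inputs are standardized first.
ensemble = VotingRegressor([
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ("gbm", GradientBoostingRegressor(random_state=0)),
    ("knn", make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))),
])

# Ten-fold cross-validation: every observation is predicted by a model that
# did not see it during fitting, and R2 is computed on those predictions.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
y_pred = cross_val_predict(ensemble, X, y, cv=cv)
cv_r2 = r2_score(y, y_pred)
print(f"ten-fold cross-validated R2: {cv_r2:.3f}")
```

Random folds over individual observations are the simplest choice for a sketch. With repeated measurements at the same monitors, splitting by monitor (for example with scikit-learn's `GroupKFold`) would give a stricter test of spatial generalization.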

Materials and Methods

Machine Learning Algorithms

Input Variables

Data Sources

Predictions

Results

Conclusions

Journal: Remote Sensing	Publication Date: Mar 12, 2020
Citations: 75	License type: CC BY 4.0
