Abstract

Satellite-derived estimates of aerosol optical depth (AOD) are key predictors in particulate air pollution models. The multi-step retrieval algorithms that estimate AOD also produce quality control variables, but these have not been systematically used to address measurement error in AOD. We compare three machine-learning methods, random forests, gradient boosting, and extreme gradient boosting (XGBoost), for characterizing and correcting measurement error in the Multi-Angle Implementation of Atmospheric Correction (MAIAC) 1 × 1 km AOD product for the Aqua and Terra satellites across the Northeastern/Mid-Atlantic USA, using collocated measurements from 79 ground-based AERONET stations over 14 years as the reference. Models included 52 quality control, land use, meteorology, and spatially-derived features. Variable importance measures suggest that relative azimuth, AOD uncertainty, and the AOD difference in 30-210 km moving windows are among the most important features for predicting measurement error. XGBoost outperformed the other machine-learning approaches, decreasing the root mean squared error in withheld testing data by 43% and 44% for Aqua and Terra, respectively. After correction using XGBoost, the correlation between collocated AOD and daily PM2.5 monitors across the region increased by 10 and 9 percentage points for Aqua and Terra, respectively. We demonstrate how machine learning with quality control and spatial features substantially improves satellite-derived AOD products for air pollution modeling.
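
The correction framing described above can be illustrated with a minimal sketch, assuming the measurement error is defined as MAIAC AOD minus collocated AERONET AOT and that a boosted model predicts this error from features. Everything here is hypothetical and for illustration only: the data are synthetic, a single made-up feature (`azimuth`) stands in for the 52 features, and the hand-rolled boosted stumps are a stand-in for the XGBoost library, not the authors' implementation.

```python
import math
import random

def rmse(pred, obs):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def fit_stump(x, residual):
    """Find the least-squares single-split stump on one feature."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # drop max so the right side is never empty
        left = [r for xi, r in zip(x, residual) if xi <= threshold]
        right = [r for xi, r in zip(x, residual) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def fit_boosted_stumps(x, y, n_rounds=50, learning_rate=0.1):
    """Gradient boosting for squared error: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residual = [yi - pi for yi, pi in zip(y, pred)]
        t, lm, rm = fit_stump(x, residual)
        stumps.append((t, lm, rm))
        pred = [p + learning_rate * (lm if xi <= t else rm)
                for xi, p in zip(x, pred)]
    return base, stumps, learning_rate

def predict(model, x):
    base, stumps, lr = model
    return [base + lr * sum(lm if xi <= t else rm for t, lm, rm in stumps)
            for xi in x]

# Synthetic collocations (assumption): the error depends on a made-up feature.
rng = random.Random(0)
n = 300
azimuth = [rng.random() for _ in range(n)]
aeronet_aot = [rng.uniform(0.05, 0.5) for _ in range(n)]
maiac_aod = [aot + 0.1 * (a > 0.5) + rng.gauss(0, 0.02)
             for aot, a in zip(aeronet_aot, azimuth)]

# Train on the first 200 collocations, withhold the last 100 for testing.
train, test = slice(0, 200), slice(200, n)
error_train = [m - a for m, a in zip(maiac_aod[train], aeronet_aot[train])]
model = fit_boosted_stumps(azimuth[train], error_train)

# Corrected AOD = raw AOD minus the predicted measurement error.
predicted_error = predict(model, azimuth[test])
corrected_aod = [m - e for m, e in zip(maiac_aod[test], predicted_error)]

rmse_raw = rmse(maiac_aod[test], aeronet_aot[test])
rmse_corrected = rmse(corrected_aod, aeronet_aot[test])
reduction_pct = 100 * (1 - rmse_corrected / rmse_raw)
print(f"raw RMSE {rmse_raw:.3f}, corrected RMSE {rmse_corrected:.3f}, "
      f"reduction {reduction_pct:.0f}%")
```

The percentage printed at the end is computed the same way a "decrease in RMSE on withheld data" would be: one minus the ratio of corrected to raw RMSE. The sketch omits regularization and feature subsampling, which XGBoost adds on top of plain gradient boosting.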

Highlights

  • A useful public health application of satellite remote sensing is to augment sparse monitoring networks and cover large time and space domains when modeling particulate matter for epidemiologic health studies [1]

  • The measurement error correction datasets with collocated AErosol RObotic NETwork (AERONET) aerosol optical thickness (AOT) and Multi-Angle Implementation of Atmospheric Correction (MAIAC) aerosol optical depth (AOD) used in this analysis included 8531 and 10,278 observations at the AERONET sites for Aqua and Terra, respectively

  • The gradient boosting approaches (GBM and XGBoost) outperform the simpler Random Forest (RF) approach at the cost of additional model hyperparameters for the learning rate and, for XGBoost only, feature subsampling; our model tuning suggests that the improved performance of XGBoost may be due to those additional model characteristics

Summary

Introduction

A useful public health application of satellite remote sensing is to augment sparse monitoring networks and cover large time and space domains when modeling particulate matter for epidemiologic health studies [1]. Recent refinements in remote sensing algorithms have resulted in higher-resolution products, such as the 1 × 1 km product of the Multi-Angle Implementation of Atmospheric Correction (MAIAC) retrieval algorithm, which estimates Aerosol Optical Depth (AOD) as a measure of the density of light-scattering particles in the atmospheric column [2,3]. While previous work has compared the agreement of MAIAC with earlier MODIS 10 × 10 km AOD products and ground monitoring data in different regions and seasons by stratifying the dataset [7], little work has been done to comprehensively understand and correct for measurement error in the MAIAC AOD product. The AErosol RObotic NETwork (AERONET) is a standardized ground-based remote sensing network for measuring aerosol optical depth; its cloud-screened and quality-assured data record is frequently used to validate satellite-based AOD products [8].

