Abstract

Abstract. The last 2 decades have seen substantial technological advances in the development of low-cost air pollution instruments using small sensors. While their use continues to spread across the field of atmospheric chemistry, the air quality monitoring community, and for commercial and private use, challenges remain in ensuring data quality and comparability of calibration methods. This study introduces a seven-step methodology for the field calibration of low-cost sensor systems using reference instrumentation with user-friendly guidelines, open-access code, and a discussion of common barriers to such an approach. The methodology has been developed and is applicable for gas-phase pollutants, such as for the measurement of nitrogen dioxide (NO2) or ozone (O3). A full example of the application of this methodology to a case study in an urban environment using both multiple linear regression (MLR) and the random forest (RF) machine-learning technique is presented with relevant R code provided, including error estimation. In this case, we have applied it to the calibration of metal oxide gas-phase sensors (MOSs). Results reiterate previous findings that MLR and RF are similarly accurate, though with differing limitations. The methodology presented here goes a step further than most studies by including explicit transparent steps for addressing model selection, validation, and tuning, as well as addressing the common issues of autocorrelation and multicollinearity. We also highlight the need for standardized reporting of methods for data cleaning and flagging, model selection and tuning, and model metrics. In the absence of a standardized methodology for the calibration of low-cost sensor systems, we suggest a number of best practices for future studies using low-cost sensor systems to ensure greater comparability of research.

Highlights

  • Air pollution remains a leading cause of premature death globally (Landrigan et al, 2018)

  • A full example of the application of this methodology to a case study in an urban environment using both multiple linear regression (MLR) and the random forest (RF) machine-learning technique is presented with relevant R code provided, including error estimation

  • To validate the MLR and RF models selected in Step 4, the co-location data were repeatedly split into training and testing subsets at a ratio of 75/25

Read more

Summary

Introduction

Air pollution remains a leading cause of premature death globally (Landrigan et al, 2018). LCSs have been used to develop or supplement existing air pollution monitoring networks to provide higher spatial resolution (e.g. CitiSense, U.S EPA Village Green), as well as in a citizen science context to report on and share information about air quality (e.g. AirVisual, Purple Air) (Morawska et al, 2018; Muller et al, 2015). Projects like these are a promising step towards empowering citizens with greater knowledge of their local air quality

Methods
Findings
Discussion
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.