Abstract

Associativity analysis is a powerful tool for handling large datasets: it clusters the data on the basis of (dis)similarity and can be used to assess the efficacy and design of air quality monitoring networks. We describe here our use of Kolmogorov–Zurbenko filtering and hierarchical clustering of NO2 and SO2 passive and continuous monitoring data to analyse and optimize air quality networks for these species in the province of Alberta, Canada. The methodology assesses dissimilarity between monitoring station time series using two metrics: 1−R, where R is the Pearson correlation coefficient, and the Euclidean distance; we find that both should be used in evaluating monitoring site similarity. We have combined the analytic power of hierarchical clustering with the spatial information provided by deterministic air quality model results, treating the gridded time series of model output as potential station locations and hence as a proxy for assessing monitoring network design and for network optimization. We demonstrate that clustering results depend on the air contaminant analysed, reflecting the difference in the respective emission sources of SO2 and NO2 in the region under study. Our work shows that much of the signal identifying the sources of NO2 and SO2 emissions resides in shorter timescales (hourly to daily), owing to short-term variation of concentrations, and that longer-term averaging in data collection may lose the information needed to identify local sources. When longer timescales (weekly to monthly) are considered, however, the methodology identifies stations that are mainly influenced by seasonality. We have performed the first dissimilarity analysis based on gridded air quality model output and have shown that the methodology is capable of generating maps of subregions within which a single station will represent the entire subregion, to a given level of dissimilarity. We have also shown that our approach is capable of identifying different sampling methodologies as well as outliers (stations whose time series are markedly different from all others in a given dataset).
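The two dissimilarity metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `cluster_stations`, the average-linkage choice, the thresholds, and the synthetic "stations" are all assumptions made for illustration, and only standard SciPy calls (`pdist`, `linkage`, `fcluster`) are used.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def cluster_stations(series, metric="correlation", threshold=0.5):
    """Cluster station time series (hypothetical helper, not from the paper).

    series    : array of shape (n_stations, n_times)
    metric    : "correlation" gives 1 - R (R = Pearson correlation),
                "euclidean" gives the Euclidean distance
    threshold : dissimilarity level at which the dendrogram is cut
    """
    # Condensed pairwise dissimilarities, ordered (0,1), (0,2), ..., (1,2), ...
    d = pdist(np.asarray(series, dtype=float), metric=metric)
    d = np.maximum(d, 0.0)  # guard against tiny negative round-off in 1 - R
    # Average linkage is an assumption; the source does not state the linkage.
    tree = linkage(d, method="average")
    labels = fcluster(tree, t=threshold, criterion="distance")
    return d, labels


# Synthetic example: three made-up "stations" with a 24 h cycle.
t = np.arange(240)
a = np.sin(2 * np.pi * t / 24)
stations = np.vstack([a, 3 * a + 5, np.cos(2 * np.pi * t / 24)])

d_r, labels_r = cluster_stations(stations, "correlation", threshold=0.5)
d_e, labels_e = cluster_stations(stations, "euclidean", threshold=20.0)
print("1 - R labels:     ", labels_r)
print("Euclidean labels: ", labels_e)
```

In this toy case the second series is a scaled and offset copy of the first, so 1−R groups the two together, while the Euclidean distance, which is sensitive to magnitude, places them far apart. This is one way to see why the abstract recommends using both metrics.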

Highlights

  • Air quality monitoring networks are established to obtain objective, reliable, and comparable information on the air quality of a specific area, and they serve the purposes of supporting measures to reduce impacts on human health and the natural environment, monitoring specific sources, and documenting air quality trends over time

  • We extend the methodology to a new application of gridded air quality model data – showing that time series from a deterministic air quality model (Global Environmental Multiscale – Modelling Air-quality and Chemistry; GEM-MACH) may be used as a surrogate for observations in air quality clustering analysis

  • With favourable comparisons to clustering results from air quality monitoring station observations, we show that model output combined with hierarchical clustering provides a new approach for monitoring network design


Summary

Introduction

Air quality monitoring networks are established to obtain objective, reliable, and comparable information on the air quality of a specific area, and they serve the purposes of supporting measures to reduce impacts on human health and the natural environment, monitoring specific sources, and documenting air quality trends over time. The data were pre-filtered by iterative moving averages (Kolmogorov–Zurbenko (KZ) filtering; Zurbenko, 1986) to assess the similarity of the spectral components of the hourly time series, independent of station location or monitoring technology employed and without requiring prior knowledge of the study area. Their analysis investigated the extent to which concentration time series similarities between the air quality monitoring stations were defined by areas with specific chemical regimes and/or predominant air masses versus by country borders and/or monitoring network jurisdiction.
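The "iterative moving averages" behind KZ filtering can be sketched in a few lines of Python. This is a simplified illustration rather than the procedure used in the study: the name `kz_filter` and its parameters are assumptions, the series is assumed to have no missing values, and the edges are simply trimmed instead of being handled as a full KZ implementation would.

```python
import numpy as np


def kz_filter(x, window, iterations):
    """Simplified KZ low-pass filter (hypothetical helper).

    Applies a centred moving average of length `window` (odd, in time
    steps) `iterations` times. Each pass drops (window - 1) edge points,
    so the result has len(x) - iterations * (window - 1) values.
    """
    y = np.asarray(x, dtype=float)
    kernel = np.ones(window) / window
    for _ in range(iterations):
        # "valid" keeps only points where the full window fits in the data
        y = np.convolve(y, kernel, mode="valid")
    return y


# Hourly toy series: a pure 24 h cycle, 20 days long.
hours = np.arange(480)
diurnal = np.sin(2 * np.pi * hours / 24)

# A 25 h window applied three times strongly damps the daily cycle.
smoothed = kz_filter(diurnal, window=25, iterations=3)
print(len(smoothed), float(np.max(np.abs(smoothed))))
```

Longer windows or more iterations remove progressively slower variations, which is how different timescales of a series can be separated before the filtered series are compared between stations.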

Study area
Monitoring data
Modelling output
Separating different timescales using KZ filtering
Assessing potential station redundancy
Spatial distribution of clusters
Ranking of stations by dissimilarity
Hierarchical clustering to cross-compare methodologies and technologies
Potential factors impacting the analysis
Findings
Conclusions