Abstract

Air quality status is a vital indicator of a country's environmental, economic and human health conditions. Characterizing the spatial distribution of air quality monitoring stations is therefore critical, because stations can be grouped into clusters in order to minimize the number of stations. This could make monitoring for human health more efficient by strengthening management and reducing costs. However, assessing air quality patterns requires multiple variables to be analysed. It thus becomes a multivariate, multidimensional problem that requires novel methodologies, as agglomerative hierarchical cluster analysis (AHC) is used primarily to determine cluster patterns [1]; one such approach is known as the hybrid cluster technique. This research therefore aimed to demonstrate a hybrid cluster framework for air quality monitoring stations in the northern region of Peninsular Malaysia, in order to provide an improved geographic cluster distribution with distinct validation.
 
The data set for all 14 air quality monitoring stations in the northern region was obtained from the Department of Environment, Malaysia (DOE) for the two years 2018 to 2019. Six air pollutants (PM10, PM2.5, SO2, NO2, O3 and CO) were included in this study. Before clustering, the chemometric technique of principal component analysis (PCA) was used to summarize the information content of the large data tables, both to gain a better understanding of the variables and to reduce dimensionality. The AHC was then performed on the PCA factor scores. The factor scores were also employed in a discriminant analysis (DA) to verify the clusters [2].
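The PCA, AHC and DA sequence described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, unrotated principal component scores standing in for the factor scores, Ward linkage, two retained components and linear discriminant analysis, none of which are stated in the abstract. The input values are synthetic placeholders, not DOE data, and the function name `hybrid_cluster` is hypothetical.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import AgglomerativeClustering
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import confusion_matrix

POLLUTANTS = ["PM10", "PM2.5", "SO2", "NO2", "O3", "CO"]


def hybrid_cluster(X, n_components=2, n_clusters=3):
    """Hypothetical helper: PCA scores -> AHC labels -> DA check.

    X is an (n_rows, n_pollutants) array. The choices of Ward linkage,
    two components and linear DA are assumptions of this sketch.
    """
    # Standardize so pollutants on different scales contribute equally.
    Z = StandardScaler().fit_transform(X)
    # Unrotated PC scores stand in for the paper's factor scores.
    scores = PCA(n_components=n_components).fit_transform(Z)
    # Agglomerative hierarchical clustering on the scores.
    labels = AgglomerativeClustering(
        n_clusters=n_clusters, linkage="ward"
    ).fit_predict(scores)
    # Discriminant analysis re-predicts the cluster labels from the scores.
    # Simplification: it is evaluated on the same rows it was fitted to.
    lda = LinearDiscriminantAnalysis().fit(scores, labels)
    cm = confusion_matrix(labels, lda.predict(scores))
    accuracy = 100.0 * np.trace(cm) / cm.sum()
    return labels, cm, accuracy


# Synthetic placeholder data: three made-up pollution levels, four rows each.
rng = np.random.default_rng(0)
centres = np.array([
    [80.0, 40.0, 10.0, 30.0, 20.0, 1.5],
    [55.0, 25.0, 6.0, 20.0, 25.0, 1.0],
    [35.0, 15.0, 3.0, 10.0, 30.0, 0.6],
])
X = np.vstack([c + rng.normal(scale=0.05 * c, size=(4, 6)) for c in centres])

labels, cm, accuracy = hybrid_cluster(X)
print(labels)
print(cm)
print(f"correct classification: {accuracy:.2f} %")
```

Clustering on the component scores rather than on the raw concentrations keeps correlated pollutants from dominating the distance calculation, which is one plausible reason for placing PCA before AHC.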
 
The PCA factor scores showed that 12 of the 14 stations needed to be investigated further using AHC. The AHC established three clusters, each grouping stations with the same characteristics, as shown in Figure 1: a High Polluted Region (HPR, three stations), a Moderate Polluted Region (MPR, four stations) and a Low Polluted Region (LPR, five stations). Each class was then distinguished using discriminant analysis (DA) for verification, which showed that each class has its own set of variables. Based on the DA confusion matrix, 87.66 % of cases in the validated data set were correctly classified, indicating a high rate of correct classification (Table 1).
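For readers unfamiliar with how a percentage is read off a confusion matrix, the calculation is the sum of the diagonal (correctly classified cases) divided by the total number of cases. The short sketch below shows this with hypothetical counts; the numbers are invented for illustration and are not the values in Table 1, and `percent_correct` is a hypothetical helper name.

```python
def percent_correct(cm):
    """Return the percentage of correctly classified cases.

    cm is a square list of lists: rows are the assigned clusters,
    columns are the classes predicted by the discriminant analysis.
    """
    total = sum(sum(row) for row in cm)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Diagonal entries are cases where the prediction matches the cluster.
    correct = sum(cm[i][i] for i in range(len(cm)))
    return 100.0 * correct / total


# Hypothetical counts for three classes (HPR, MPR, LPR); not from Table 1.
example_cm = [
    [9, 1, 0],
    [1, 8, 1],
    [0, 2, 8],
]
rate = percent_correct(example_cm)
print(f"{rate:.2f} %")
```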
The framework presented here is a novel tool for identifying and categorising stations based on air quality pollutants. The hybrid cluster analysis technique used in this study can produce more precise pollutant distributions, which are useful in air pollution research. In addition, the study could help to simplify the existing methodology for air quality assessment and improve the evaluation of air quality status. It could thus become an alternative way of analysing changes in air quality, especially where historical data are absent or limited, and support a better and more sustainable indicator for air quality assessment and management.
