Quality control of online monitoring data of air pollutants using artificial neural networks

Ziyu Wang, Jingjing Feng, Jinping Cheng, Xiaojia Chen, Song Gao, Qingyan Fu

doi:10.1007/s11869-019-00734-4

Abstract

The intensive monitoring of air pollutants has led to the acquisition of vast quantities of data. Traditional quality control methods based on existing knowledge may be inefficient because of our limited understanding of the interaction between human activities and stochastic environmental factors. Moreover, traditional methods for outlier detection may be misleading because valid outliers and invalid inliers both exist. In this research, artificial neural networks (ANNs) are developed to identify instrument failure based on current and historical observations. Two structures, i.e., multilayer perceptrons and recurrent networks, are trained using 50,000 hourly data points labeled by human reviewers. The most conservative model identified 57.5% of the invalid sulfur compound observations and 44.9% of the invalid nitrogen compound observations. With a more liberal threshold, these values increased to 76.0% and 79.7%, respectively. Except for SO2, the ANNs outperformed the traditional methods for data quality control, namely a plausibility test, a test of temporal consistency, and a residual analysis. Compared with the test of temporal consistency, which was the most effective traditional method studied, the true positive rates of the ANNs were 19.4% to 29.5% higher for all pollutants except SO2 at the same false positive rates. The results indicate that ANNs are effective for data quality control even without supplementary information. Methods for performance improvement are discussed.
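
The comparison at "the same false positive rates" and the conservative versus liberal thresholds can be illustrated with a small sketch. It is not the authors' code: the function names `rates_at_threshold` and `tpr_at_fpr`, the toy scores, and the convention that label 1 means "invalid observation" and a higher score means "more likely invalid" are all assumptions made for illustration.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (true positive rate, false positive rate) when every
    observation with score >= threshold is flagged as invalid."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    n_pos = sum(labels)            # observations labeled invalid
    n_neg = len(labels) - n_pos    # observations labeled valid
    return tp / n_pos, fp / n_neg


def tpr_at_fpr(scores, labels, max_fpr):
    """Highest true positive rate reachable while keeping the false
    positive rate at or below max_fpr (thresholds tried at observed scores)."""
    best = 0.0
    for t in set(scores):
        tpr, fpr = rates_at_threshold(scores, labels, t)
        if fpr <= max_fpr:
            best = max(best, tpr)
    return best


# Hypothetical model outputs and reviewer labels (1 = invalid, 0 = valid).
scores = [0.95, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

conservative = rates_at_threshold(scores, labels, 0.9)  # few flags, no false alarms
liberal = rates_at_threshold(scores, labels, 0.5)       # more flags, some false alarms
```

Lowering the threshold raises the share of invalid observations caught at the cost of more valid observations flagged. Calling `tpr_at_fpr` for two detectors with the same `max_fpr` gives the kind of like-for-like comparison the abstract describes.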
