Abstract

The intensive monitoring of air pollutants has led to the acquisition of vast quantities of data. Traditional quality control methods based on existing knowledge may be inefficient because of our limited understanding of the interaction between human activities and stochastic environmental factors. Moreover, traditional methods for outlier detection may be misleading because valid outliers and invalid inliers both exist. In this research, artificial neural networks (ANNs) are developed to identify instrument failure based on current and historical observations. Two architectures, multilayer perceptrons and recurrent networks, are trained using 50,000 hourly data points labeled by human reviewers. The most conservative model identified 57.5% of the invalid sulfur compound observations and 44.9% of the invalid nitrogen compound observations. With a more liberal threshold, these values increased to 76.0% and 79.7%, respectively. Except for SO2, the ANNs outperformed the traditional methods for data quality control, represented here by a plausibility test, a test of temporal consistency and a residual analysis. Compared with the test of temporal consistency, which was the most effective traditional method studied, the true positive rates of the ANNs were 19.4% to 29.5% higher for all pollutants except SO2, given the same false positive rates. The results indicate the effectiveness of ANNs for data quality control even without supplementary information. Methods for performance improvement are discussed.
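
The comparison "given the same false positive rates" can be illustrated with a minimal sketch. This is not the authors' code: the function name `tpr_at_fpr`, the convention that a higher score means "more likely invalid", and the toy data are all assumptions made for illustration. The idea is to pick the decision threshold from the scores of the valid observations so that the false positive rate does not exceed a chosen target, and then read off the true positive rate at that threshold. Lowering the threshold (a more liberal setting) flags more invalid observations at the cost of more false alarms.

```python
import numpy as np

def tpr_at_fpr(scores, is_invalid, target_fpr):
    """Return (tpr, fpr, threshold) for the loosest threshold whose
    false positive rate does not exceed target_fpr.

    scores     : anomaly scores, higher = more likely invalid (assumed convention)
    is_invalid : reviewer labels, True where the observation is invalid
    target_fpr : allowed fraction of valid observations that may be flagged
    """
    scores = np.asarray(scores, dtype=float)
    is_invalid = np.asarray(is_invalid, dtype=bool)

    # Scores of the valid observations, sorted from highest to lowest.
    valid_scores = np.sort(scores[~is_invalid])[::-1]

    # Number of valid observations we are allowed to flag.
    k = int(np.floor(target_fpr * valid_scores.size))

    # Flag only scores strictly above the (k+1)-th highest valid score,
    # so at most k valid observations are flagged.
    threshold = -np.inf if k >= valid_scores.size else valid_scores[k]
    flagged = scores > threshold

    tpr = flagged[is_invalid].mean()   # share of invalid points caught
    fpr = flagged[~is_invalid].mean()  # share of valid points flagged
    return tpr, fpr, threshold

# Toy data (hypothetical): three invalid and five valid observations.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [True, True, False, True, False, False, False, False]

conservative = tpr_at_fpr(scores, labels, target_fpr=0.0)
liberal = tpr_at_fpr(scores, labels, target_fpr=0.2)
print(conservative, liberal)
```

Two detection methods can then be compared by calling the function with the same `target_fpr` on each method's scores and comparing the returned true positive rates.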
