Abstract

In geophysics, crowdsourcing is an emerging nontraditional environmental monitoring approach that supports data acquisition from individual citizens. However, because it relies on undertrained citizens and imprecise low‐cost sensors, crowdsourced data suffer from several types of noise that can degrade overall monitoring accuracy. In this study, we propose a machine learning approach for automatic crowdsourced data quality control (CSQC) that detects and removes noisy inputs in spatially and temporally discrete crowdsourced observations coming from both fixed‐point sensors (e.g., surveillance cameras) and moving sensors (e.g., moving cars/pedestrians). We design a set of features from original and interpolated rainfall data and use them to train and test the CSQC models with both supervised and unsupervised machine learning algorithms. We also test how the CSQC models perform under various scenarios without retraining (hereafter referred to as transferability). The results, based on synthetic but realistic data, show that the CSQC models can significantly reduce the overall rainfall estimation errors. Under the stationarity assumption, the CSQC models based on both supervised and unsupervised algorithms perform well in identifying noisy data and reducing the overall rainfall estimation error; however, if a model is transferred to other cities with different rainfall patterns or noise compositions (without retraining), supervised multilayer perceptrons (MLPs) show the best performance.
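As a rough illustration of the idea of comparing original observations with interpolated rainfall, the sketch below flags suspicious inputs with a simple unsupervised rule. It is not the CSQC model described above: the leave-one-out inverse-distance-weighting interpolator, the single residual feature, the median-plus-MAD threshold, and all function names and parameters (`idw_leave_one_out`, `flag_noisy`, `power`, `k`) are assumptions made for this example only.

```python
import math
from statistics import median


def idw_leave_one_out(points, power=2.0):
    """Interpolate rainfall at each observation's location from all OTHER
    observations using inverse-distance weighting (an assumed interpolator).

    points: list of (x, y, rain) tuples.
    """
    estimates = []
    for i, (xi, yi, _) in enumerate(points):
        num = 0.0
        den = 0.0
        for j, (xj, yj, rj) in enumerate(points):
            if i == j:
                continue  # leave the observation itself out
            dist = math.hypot(xi - xj, yi - yj)
            weight = 1.0 / max(dist, 1e-9) ** power
            num += weight * rj
            den += weight
        estimates.append(num / den)
    return estimates


def flag_noisy(points, k=3.0):
    """Flag observations whose absolute residual against the interpolated
    value exceeds median + k * MAD of all residuals (hypothetical rule)."""
    estimates = idw_leave_one_out(points)
    residuals = [abs(p[2] - e) for p, e in zip(points, estimates)]
    med = median(residuals)
    mad = median([abs(r - med) for r in residuals])
    threshold = med + k * max(mad, 1e-9)
    return [r > threshold for r in residuals]


# Toy example: a 3 x 3 grid reporting 10 mm/h everywhere except the centre,
# which reports an implausible 50 mm/h.
grid = [(float(x), float(y), 10.0) for y in range(3) for x in range(3)]
grid[4] = (1.0, 1.0, 50.0)
flags = flag_noisy(grid)
```

In this toy case only the centre observation is flagged. A supervised variant would instead feed such residual-type features, together with known noise labels, to a classifier such as an MLP.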
