Abstract

Machine learning algorithms are widely used in IoT application systems to predict labels for input data streams from IoT sensors. Since data streams are not always stationary in real-world scenarios, changes in the underlying data distribution may cause a deterioration of prediction performance, a phenomenon known as concept drift. Existing concept drift detection methods often assume that complete and true labels for detecting prediction errors are available immediately after each prediction. However, such an assumption is not realistic in real IoT application systems. This paper experimentally investigates the robustness of six representative concept drift detectors against unreliable data streams containing data with erroneous labels and data with missing labels. Robustness is evaluated in terms of average detection delay and precision as the ratios of error-labeled and missing-label data in a synthetic data stream increase. Our experimental results show that Cumulative Sum (CUSUM), Page-Hinkley (PH), and the Drift Detection Method (DDM) achieve relatively stable performance when the ratio of error-labeled data is less than 40%. With respect to drift detection efficiency, CUSUM can be regarded as the most robust of the detectors tested in our experiments.
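To illustrate the kind of detector the paper evaluates, below is a minimal sketch of the Page-Hinkley (PH) test applied to a stream of prediction errors. This is an illustrative implementation based on the standard PH formulation, not the authors' code; the parameter values (`delta`, `threshold`) and the simulated error rates are assumptions chosen for the example.

```python
import random


class PageHinkley:
    """Minimal Page-Hinkley test for detecting an increase in the mean
    of a stream (here, a 0/1 prediction-error indicator).

    Illustrative sketch; not the paper's implementation.
    """

    def __init__(self, delta=0.005, threshold=30.0):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm threshold (lambda)
        self.mean = 0.0             # running mean of observations
        self.n = 0                  # number of observations seen
        self.cum = 0.0              # cumulative deviation m_t
        self.min_cum = 0.0          # running minimum of m_t

    def update(self, x):
        """Feed one observation; return True if drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


# Simulate a stream whose error rate jumps at t = 1000 (hypothetical
# drift point chosen for the demo), mimicking concept drift.
random.seed(0)
ph = PageHinkley(delta=0.005, threshold=30.0)
drift_at = None
for t in range(2000):
    error_rate = 0.05 if t < 1000 else 0.5
    err = 1 if random.random() < error_rate else 0
    if ph.update(err):
        drift_at = t
        break
```

The detection delay (how far `drift_at` lands past the true change point at t = 1000) is one of the two robustness metrics the paper reports; the other, precision, penalizes alarms raised when no drift occurred.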
