Abstract
Detecting concept drifts and reducing the impact from the noise in real applications of data streams are challenging but valuable for inductive learning. It is especially a challenge in a light demand on the overheads of time and space. However, though a great number of inductive learning algorithms based on ensemble classification models have been proposed for handling concept drifting data streams, little attention has been focused on the detection of the diversity of concept drifts and the influence from noise in data streams simultaneously. Motivated by this, we present a new light-weighted inductive algorithm for concept drifting detection in virtue of an ensemble model of random decision trees (named CDRDT) to distinguish various types of concept drifts from noisy data streams in this article. We use variably small data chunks to generate random decision trees incrementally. Meanwhile, we introduce the inequality of Hoeffding bounds and the principle of statistical quality control to detect the different types of concept drifts and noise. Extensive studies on synthetic and real streaming data demonstrate that CDRDT could effectively and efficiently detect concept drifts from the noisy streaming data. Therefore, our algorithm provides a feasible reference framework of classification for concept drifting data streams with noise.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.