Abstract

Concept drift, which refers to changes in the data distribution, is a critical challenge in mining data streams. Current drift detection and adaptation methods focus on detecting distribution changes as soon as concept drift occurs and on swiftly updating the model so that it applies to newly arrived data instances. Most of these methods assume that the data contain no noise, or that the noise is too weak to affect the modeling procedure. However, real-world data are normally contaminated, so denoising is usually a necessary preprocessing step. The issue is more complex for a data stream with concept drift because noise is very likely to be confused with drift. Motivated by this, the paper proposes a Noise-tolerant Fuzzy c-means based drift Adaptation method (NFA) that adapts to changing distributions and is suitable for noisy data streams. Concept drift is handled by a fuzzy c-means based regression model that continuously includes in the training set the data instances most relevant to the latest pattern. In addition, NFA includes a denoising technique that supports incremental updating, so it can be embedded in the incremental drift adaptation process; NFA can therefore address concept drift and noise at the same time. Experimental results show that the method performs well on data streams with concept drift and noise.
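
For readers unfamiliar with fuzzy c-means, the sketch below shows the textbook membership and center updates that such a model builds on. It is not the NFA method itself: the abstract does not give NFA's regression, denoising, or incremental-update rules, so none of that is reproduced here. The function names `fcm_memberships` and `fcm_centers`, the fuzzifier default `m=2.0`, and the toy data are assumptions made for illustration only.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Standard fuzzy c-means membership update (illustrative, not NFA).

    Returns one row per point; each row holds that point's degree of
    membership in every cluster, and each row sums to 1.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point sits exactly on one or more centers:
            # share full membership among those centers only.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        row = []
        for d_j in dists:
            denom = sum((d_j / d_k) ** exponent for d_k in dists)
            row.append(1.0 / denom)
        memberships.append(row)
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Standard fuzzy c-means center update: membership-weighted means."""
    n_clusters = len(memberships[0])
    dim = len(points[0])
    centers = []
    for j in range(n_clusters):
        weights = [u[j] ** m for u in memberships]
        total = sum(weights)
        centers.append(
            [sum(w * p[d] for w, p in zip(weights, points)) / total
             for d in range(dim)]
        )
    return centers


# Hypothetical toy stream window: two groups of one-dimensional points.
points = [[0.0], [1.0], [10.0], [11.0]]
centers = [[0.0], [10.0]]
for _ in range(5):  # a few alternating updates
    u = fcm_memberships(points, centers)
    centers = fcm_centers(points, u)
```

Because every instance carries a graded membership rather than a hard label, a membership-weighted scheme of this kind offers one way to rank instances by relevance to a given pattern; how NFA actually uses the memberships is described in the full paper, not in this abstract.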
