Abstract

Learning from non-stationary data streams is challenging because streams are unbounded in length and evolve over time. Existing work concentrates mostly on the concept-drift problem. Concept evolution, in which novel classes emerge in the stream, has recently gained attention because of its practical value in many real-world applications. Designing a robust learning model for data streams that handles concept drift, concept evolution, and outliers simultaneously is therefore of significant importance. To this end, we propose a new data stream classification approach, called EMC, which dynamically learns Evolving Micro-Clusters to examine both concept drift and concept evolution. Specifically, to capture time-changing concepts, EMC dynamically maintains a set of online micro-clusters and learns their importance with error-based representative learning. Building upon the evolving micro-clusters, a novel class detector based on a local density perspective is introduced, which allows EMC to handle data streams with complex class distributions. In addition, EMC can distinguish concept drift and concept evolution from noisy instances. Extensive experiments on both synthetic and real-world data sets show that our method achieves good classification and novel class detection performance compared to state-of-the-art algorithms.
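
To make the idea of error-driven micro-cluster maintenance concrete, the following is a minimal Python sketch. It is not the EMC algorithm: the class names (`MicroCluster`, `StreamSketch`), the fixed `radius`, and the multiplicative `reward`/`penalty` update are all assumptions chosen for illustration, and a fixed distance threshold stands in for the local-density novel class detector described above.

```python
import math


class MicroCluster:
    """Labeled micro-cluster summarized by a running mean (hypothetical structure)."""

    def __init__(self, point, label):
        self.center = list(point)
        self.n = 1
        self.label = label
        self.weight = 1.0  # importance, adjusted by prediction errors (assumed rule)

    def distance(self, point):
        return math.dist(self.center, point)

    def absorb(self, point):
        # Incremental mean update of the cluster center.
        self.n += 1
        self.center = [c + (x - c) / self.n for c, x in zip(self.center, point)]


class StreamSketch:
    """Illustrative online learner; all parameters are assumptions, not EMC's."""

    def __init__(self, radius=1.0, reward=1.1, penalty=0.5, min_weight=0.1):
        self.radius = radius
        self.reward = reward
        self.penalty = penalty
        self.min_weight = min_weight
        self.clusters = []

    def predict(self, point):
        """Return (label, cluster), or (None, None) if no cluster covers the point."""
        if not self.clusters:
            return None, None
        # Nearest cluster wins; ties are broken in favor of the higher weight.
        nearest = min(self.clusters, key=lambda mc: (mc.distance(point), -mc.weight))
        if nearest.distance(point) > self.radius:
            return None, None  # candidate outlier or novel-class instance
        return nearest.label, nearest

    def learn(self, point, label):
        """Predict first, then update clusters with the true label."""
        predicted, nearest = self.predict(point)
        if nearest is None:
            # Uncovered region: start a new micro-cluster.
            self.clusters.append(MicroCluster(point, label))
        elif predicted == label:
            # Correct prediction: reward the cluster and absorb the point.
            nearest.weight *= self.reward
            nearest.absorb(point)
        else:
            # Wrong prediction: penalize the cluster and record the new concept.
            nearest.weight *= self.penalty
            self.clusters.append(MicroCluster(point, label))
        # Drop clusters whose importance has decayed below the floor.
        self.clusters = [mc for mc in self.clusters if mc.weight >= self.min_weight]
        return predicted
```

In this sketch, a cluster that keeps mispredicting loses weight and is eventually pruned, which loosely mimics adapting to concept drift, while a point that falls outside every cluster's radius is returned as `None` and treated as a candidate outlier or novel-class instance.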
