Abstract

Data stream mining has gained increasing attention in recent years due to its wide range of applications. In this paper, we propose a new selective prototype-based learning (SPL) method for evolving data streams, which dynamically maintains representative instances to capture time-changing concepts and makes predictions in a local fashion. As an instance-based learning model, SPL maintains only a set of important prototypes (the ISet), selected via error-driven representativeness learning. The fast condensed nearest neighbor (FCNN) rule is further introduced to compress these prototypes, making the algorithm applicable under memory constraints as well. To better distinguish noise from instances associated with a newly emerging concept, a potential concept instance set (the PSet) is used to store all misclassified instances. Relying on the PSet, a local-aware, distribution-based concept drift detection approach is proposed. SPL has several attractive benefits: (a) it fits evolving data streams well while maintaining only a small instance set; (b) it captures both gradual and sudden concept drifts effectively; (c) it distinguishes noise and outliers from drifting instances. Experimental results show that SPL achieves better classification performance than many other state-of-the-art algorithms.
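
The general idea of keeping a small prototype set alongside a buffer of misclassified instances can be illustrated with a toy sketch. This is not the SPL algorithm: the class name, the parameters `max_prototypes` and `max_pending`, and the update rule are hypothetical simplifications. It assumes plain 1-nearest-neighbor prediction with Euclidean distance, adds an instance to both sets only when it is misclassified, and omits representativeness learning, FCNN compression, and drift detection entirely.

```python
import math
from collections import deque


class PrototypeStreamSketch:
    """Toy 1-NN stream learner; a simplification, not the SPL algorithm."""

    def __init__(self, max_prototypes=50, max_pending=20):
        # Hypothetical bounded buffers; the oldest entries are evicted first.
        self.iset = deque(maxlen=max_prototypes)  # retained prototypes (x, y)
        self.pset = deque(maxlen=max_pending)     # recently misclassified (x, y)

    def predict(self, x):
        """Return the label of the nearest prototype, or None if there is none."""
        if not self.iset:
            return None
        _, label = min(self.iset, key=lambda p: math.dist(p[0], x))
        return label

    def learn_one(self, x, y):
        """Test-then-train: predict first, store the instance only on error."""
        y_hat = self.predict(x)
        if y_hat != y:
            self.pset.append((x, y))  # candidate for a new concept (or noise)
            self.iset.append((x, y))  # assumption: errors become prototypes
        return y_hat


model = PrototypeStreamSketch()
stream = [
    ((0.0, 0.0), "a"),
    ((1.0, 1.0), "b"),
    ((0.1, 0.0), "a"),
    ((0.9, 1.0), "b"),
]
for x, y in stream:
    model.learn_one(x, y)
```

In this toy run only the first two instances are misclassified, so they are the only ones stored; the later two are already predicted correctly and are discarded, which is what keeps the stored set small.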
