Abstract
Ensemble techniques are powerful methods for recognising and reacting to changes in non-stationary data. However, most research on dynamic classification with ensembles assumes that the true class label of each incoming point is available or easily obtained. This is unrealistic in most practical applications, especially in high-velocity streams where manually labeling each point is prohibitively expensive. To address this challenge, this paper proposes an algorithm, named Clustering and One-Class Classification Ensemble Learning (COCEL), which combines a stream clustering algorithm and an ensemble of one-class classifiers with active learning for classification in dynamic data streams. The method exploits the intuitive relationship between clusters and one-class classifiers to cope with a small training set (or no training set) and to improve with experience, self-modifying its internal state to adapt to changes in the data stream. The proposed method is evaluated on synthetic data streams exhibiting concept evolution and concept drift, and on a collection of high-velocity real data streams where manually labeling each incoming point is either infeasible or expensive and labor-intensive. Finally, a comparative evaluation against peer stream classification ensembles shows that COCEL can achieve superior or comparable accuracy while typically requiring less than 0.01% of the stream labels.
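The pairing of clusters with one-class classifiers described above can be illustrated with a minimal Python sketch. It is not the COCEL algorithm as published: the abstract does not specify the clustering method, the one-class models, or the query strategy, so everything here is an assumption made for illustration. In particular, the names `SphereModel`, `ClusterEnsemble`, and `run_demo`, the fixed `radius`, and the rule "request a label only when no model accepts the point" are hypothetical.

```python
import math
import random


class SphereModel:
    """Hypothetical one-class model: a running centroid with a fixed radius."""

    def __init__(self, point, label, radius):
        self.centroid = list(point)
        self.label = label
        self.radius = radius
        self.count = 1

    def distance(self, point):
        return math.sqrt(sum((c - x) ** 2 for c, x in zip(self.centroid, point)))

    def accepts(self, point):
        return self.distance(point) <= self.radius

    def update(self, point):
        # Incremental mean, so the centroid can follow gradual drift.
        self.count += 1
        for i, x in enumerate(point):
            self.centroid[i] += (x - self.centroid[i]) / self.count


class ClusterEnsemble:
    """Assumed design: one labelled one-class model per discovered cluster."""

    def __init__(self, oracle, radius=1.0):
        self.oracle = oracle      # callable returning the true label of a point
        self.radius = radius
        self.models = []
        self.queries = 0          # number of labels requested so far

    def process(self, point):
        accepting = [m for m in self.models if m.accepts(point)]
        if accepting:
            # Predict with the closest accepting model and let it adapt.
            best = min(accepting, key=lambda m: m.distance(point))
            best.update(point)
            return best.label
        # No model claims the point: request one label and start a new model.
        self.queries += 1
        label = self.oracle(point)
        self.models.append(SphereModel(point, label, self.radius))
        return label


def run_demo(n=500, seed=0):
    """Toy stream of two well-separated classes; returns (queries, accuracy)."""
    rng = random.Random(seed)
    centres = {"a": (0.0, 0.0), "b": (5.0, 5.0)}
    stream = []
    for i in range(n):
        label = "a" if i % 2 == 0 else "b"
        cx, cy = centres[label]
        point = (cx + rng.uniform(-0.3, 0.3), cy + rng.uniform(-0.3, 0.3))
        stream.append((point, label))
    truth = dict(stream)  # stands in for a human annotator
    ensemble = ClusterEnsemble(oracle=truth.__getitem__, radius=1.0)
    correct = sum(ensemble.process(p) == y for p, y in stream)
    return ensemble.queries, correct / n
```

Calling `run_demo()` on this toy stream requests a label only for the first point of each cluster, because every later point falls inside an existing model. Real streams with overlapping or evolving classes would need a more careful acceptance test and query policy than this sketch provides.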
More From: IEEE Transactions on Knowledge and Data Engineering