Abstract

Purpose: One of the significant problems in data stream classification is concept drift, a change in the probabilistic characteristics of the classification task. Such changes in posterior probability destabilize the classification model and can seriously degrade its quality. Appropriate strategies are therefore needed to counteract this phenomenon and allow the classifier to adapt to the changing probabilistic characteristics. Designing such an approach is difficult when access to data labels is limited. High-quality human labeling is usually costly, so to minimize the expense of this process it is also necessary to propose learning strategies based on semi-supervised learning. Such strategies employ active learning methods that indicate which incoming objects are worth labeling in order to improve the classifier's performance.

Methods: This paper proposes the Active Weighted Aging Ensemble (AWAE) algorithm, a novel chunk-based method for non-stationary data stream classification. It employs a classifier ensemble and changes the ensemble lineup to react appropriately to concept drift. It also proposes a new active learning method that respects a limited labeling budget and may be applied to any data stream classifier.

Results: AWAE has been evaluated through computer experiments on real and synthetic data streams, confirming its high quality relative to state-of-the-art methods.

Conclusion: The research conducted on benchmark data streams confirmed the effectiveness of the proposed solution and highlighted its strengths in comparison with state-of-the-art methods. The estimated computational complexity is acceptable and comparable to that of the benchmark algorithms.
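The idea of budget-limited active learning on a data chunk can be illustrated with a minimal sketch. It is not the AWAE selection rule, which the abstract does not specify. It assumes a generic margin-based uncertainty criterion: the objects whose two highest class supports are closest together are sent for labeling, up to a fixed fraction of the chunk. The names `margin`, `select_for_labeling`, and `budget` are hypothetical, and each object is assumed to have supports for at least two classes.

```python
from typing import List, Sequence


def margin(proba: Sequence[float]) -> float:
    """Difference between the two largest class supports (small = uncertain)."""
    top = sorted(proba, reverse=True)
    return top[0] - top[1]


def select_for_labeling(
    chunk_probas: Sequence[Sequence[float]],
    budget: float,
) -> List[int]:
    """Return indices of the objects in a chunk to send for labeling.

    budget is the fraction of the chunk that may be labeled (0.0 to 1.0).
    This is an assumed, generic criterion, not the rule used by AWAE.
    """
    if not 0.0 <= budget <= 1.0:
        raise ValueError("budget must be in [0, 1]")
    n_queries = int(budget * len(chunk_probas))
    # Rank objects from least to most confident prediction.
    ranked = sorted(
        range(len(chunk_probas)),
        key=lambda i: margin(chunk_probas[i]),
    )
    # Keep the most uncertain objects, reported in stream order.
    return sorted(ranked[:n_queries])


# Supports produced by any probabilistic classifier for a chunk of four objects.
chunk = [[0.9, 0.1], [0.55, 0.45], [0.5, 0.5], [0.8, 0.2]]
selected = select_for_labeling(chunk, budget=0.5)
```

With a budget of 0.5, only the two objects with the smallest margins are queried; the remaining objects stay unlabeled and cost nothing.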
