Abstract
In this article, we consider the problem of semi-supervised classification of data streams. Most semi-supervised learning algorithms lack a proper metric for selecting among the newly labeled data points during training. These approaches mainly use the base learners' probability estimates for their own predictions as the selection metric, which is suboptimal in many cases. Handling different kinds of concept drift is another challenge in data streams. To address these issues, we propose a novel Semi-Supervised Ensemble algorithm using a Performance-Based Selection metric for data streams, named SSE-PBS. The proposed selection metric is based on pseudo-accuracy and an energy regularization factor. We show that SSE-PBS improves classification performance and handles different kinds of concept drift. The proposed algorithm can also employ any kind of incremental base learner. In the experiments, we report the results of two base learners on synthetic and real-world datasets. The experiments show that SSE-PBS significantly improves the classification performance of the underlying base learners. Furthermore, we compare the results with state-of-the-art supervised and semi-supervised approaches for data streams. The results further show that SSE-PBS outperforms the other methods when only a small portion of the instances is labeled.