Reduced-Space Multistream Classification Based on Multiobjective Evolutionary Optimization

Botao Jiao, Dunwei Gong, Jiayang Pu, Shengxiang Yang, Yinan Guo

doi:10.1109/tevc.2022.3232466

Open Access

Abstract

In traditional data stream mining, classification models are typically trained on labeled samples from a single source. In real-world scenarios, however, obtaining accurate labels is difficult and expensive, especially when multiple data streams are sampled concurrently from the same environment or process. To address this issue, multistream classification has been proposed, in which a data stream with biased labels (called the source stream) is leveraged to train a model for prediction over another stream of unlabeled samples (called the target stream). Despite growing research in this field, most previous multistream classification methods are designed for scenarios with a single source stream, even though multiple source streams contain diverse data distributions and thus provide more valuable information for building an accurate model. In addition, previous works construct classification models in the original shared feature space, ignoring the effect of redundant or low-quality features on classification performance, which may lead to inefficient knowledge transfer across streams. In view of this, a reduced-space multistream classification method based on multiobjective evolutionary optimization is proposed in this paper. First, multiobjective evolutionary optimization is employed to seek the most valuable feature subset shared by the source and target domains, with the purpose of narrowing the distribution difference between the source and target streams. Following that, a Gaussian mixture model-based weighting mechanism for source samples is presented. Moreover, two drift adaptation methods are proposed to address asynchronous drift. Experimental results on benchmark datasets show that the proposed method outperforms the compared methods in classification accuracy and G-mean.
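
The abstract reports results in terms of classification accuracy and G-mean. The sketch below shows how the two metrics are commonly computed, taking G-mean as the geometric mean of per-class recalls. The abstract does not describe the paper's exact evaluation protocol, so this is the standard definition rather than the authors' code, and the function names `accuracy` and `g_mean` and the toy labels are illustrative assumptions.

```python
from collections import defaultdict
from math import prod


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if not y_true:
        raise ValueError("y_true must not be empty")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (standard definition, assumed here)."""
    if not y_true:
        raise ValueError("y_true must not be empty")
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly classified samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return prod(recalls) ** (1.0 / len(recalls))


# Toy imbalanced example: a classifier that always predicts the majority class.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
print("accuracy:", accuracy(y_true, y_pred))
print("G-mean:", g_mean(y_true, y_pred))
```

In this toy case the majority-class predictor reaches an accuracy of 0.8 while its G-mean is 0, because the recall on the minority class is zero. This is why G-mean is often reported alongside accuracy when class distributions are imbalanced.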