Abstract

With the explosive growth of data generated on the Internet and in other fields, data stream classification has attracted broad interest. Some challenges in data streams, such as concept drift detection and supervised data stream classification, have been well studied. However, when confronted with mixed data streams (containing both categorical and numerical values) or with few labeled samples, many data stream methods perform poorly or cannot work at all. To tackle these two problems, we propose an Incremental Semi-supervised Extreme Learning Machine for Mixed data stream classification (MIS-ELM). Specifically, for mixed data, we design a novel soft one-hot encoding method that combines the coupling object similarity method with one-hot encoding; it embeds categorical data into high-quality numerical data and is used in the data preprocessing phase of MIS-ELM. For limited labeled samples, we introduce an incremental learning method based on unlabeled data, which is employed in the classifier training phase of MIS-ELM. When no concept drift occurs in the data stream, MIS-ELM uses only unlabeled data for incremental learning to fine-tune the classifier trained in the previous sliding window. MIS-ELM also inherits the fast computation of ELM, which makes it well suited to real-time processing of data streams. Finally, we evaluate the representation performance of the soft one-hot encoding and the classification performance of MIS-ELM on real data streams. The experimental results demonstrate that the proposed methods outperform the state-of-the-art techniques in their respective areas.
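To illustrate why an ELM is fast to train, the following is a minimal sketch of a plain supervised ELM classifier. It is not MIS-ELM: it omits the soft one-hot encoding, the semi-supervised incremental update, and concept drift handling. The class name `BasicELM`, the helper `demo`, the hidden layer size, the ridge term `reg`, and the toy data are all assumptions made for illustration. The hidden-layer weights are drawn at random and left fixed, so training reduces to one regularized least-squares solve for the output weights.

```python
import numpy as np


class BasicELM:
    """Plain ELM classifier (illustrative sketch, not the paper's MIS-ELM)."""

    def __init__(self, n_hidden=50, reg=1e-3, seed=0):
        self.n_hidden = n_hidden  # number of random hidden units (assumed value)
        self.reg = reg            # ridge regularization strength (assumed value)
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activations of the fixed random hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        n_classes = int(y.max()) + 1
        T = np.eye(n_classes)[y]  # one-hot targets
        # Random input weights and biases are drawn once and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights from a single ridge-regularized least-squares solve.
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)


def demo():
    """Fit on two well-separated Gaussian blobs and return training accuracy."""
    rng = np.random.default_rng(1)
    X = np.vstack([
        rng.normal(loc=-2.0, scale=1.0, size=(100, 2)),
        rng.normal(loc=2.0, scale=1.0, size=(100, 2)),
    ])
    y = np.repeat([0, 1], 100)
    model = BasicELM().fit(X, y)
    return float(np.mean(model.predict(X) == y))
```

Calling `demo()` returns the training accuracy on the toy data. Because no iterative optimization is involved, refitting on each new window of a stream is cheap, which is the property the abstract refers to.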
