Abstract

In a streaming environment, the characteristics and labels of instances may change over time, producing concept drift. Previous studies on data stream learning generally assume that the true label of each instance is available or easily obtained, which is impractical in many real-world applications because labeling is expensive in both time and labor. To address this issue, an active broad learning method based on multi-objective evolutionary optimization is presented for classifying non-stationary data streams. The instance arriving at each time step is appended to a chunk. Once the chunk is full, its data distribution is compared with those of previous chunks by fast local drift detection to identify potential concept drift. A multi-objective evolutionary algorithm, which accounts for both the diversity of instances and their relevance to the new concept, is then used to find the most valuable candidate instances. From these candidates, representative instances are randomly selected and their ground-truth labels are queried; the labeled instances are then used to update the broad learning model for drift adaptation. In particular, the number of representative instances is determined by the stability of adjacent historical chunks. Experimental results on 7 synthetic and 5 real-world datasets show that the proposed method outperforms five state-of-the-art methods in classification accuracy and labeling cost, because drift regions are accurately identified and the labeling budget is adaptively adjusted.
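
The candidate-selection step trades off two objectives, diversity and relevance to the new concept. The following is a minimal sketch of the Pareto-dominance test that underlies such a trade-off, not the evolutionary algorithm used in the paper. The function name `pareto_front`, the score values, and the assumption that both objectives are maximized are illustrative and do not come from the source.

```python
from typing import List, Tuple


def pareto_front(scores: List[Tuple[float, float]]) -> List[int]:
    """Return indices of non-dominated points, assuming both objectives are maximized."""
    front = []
    for i, (d_i, r_i) in enumerate(scores):
        dominated = False
        for j, (d_j, r_j) in enumerate(scores):
            if j == i:
                continue
            # j dominates i if it is no worse on both objectives
            # and strictly better on at least one of them.
            if d_j >= d_i and r_j >= r_i and (d_j > d_i or r_j > r_i):
                dominated = True
                break
        if not dominated:
            front.append(i)
    return front


# Hypothetical (diversity, relevance) scores for five unlabeled instances in a chunk.
scores = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9), (0.4, 0.4), (0.2, 0.3)]
candidates = pareto_front(scores)
print(candidates)
```

In this sketch the non-dominated instances play the role of the candidate set from which representative instances would then be drawn for label queries. How the paper actually scores diversity and relevance is not specified in the abstract.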
