Abstract

Learning with data streams has attracted much attention in recent decades. Conventional approaches typically assume that both the feature and the label of a data item can be observed in a timely manner at each round. In many real-world tasks, however, either the feature or the label is often observed first while the other arrives with a delay. For instance, in distributed learning systems, a central processor collects training data from different sub-processors to train a learning model, and the feature and label of certain data items can arrive asynchronously due to network latency. The problem of learning with asynchronous features or labels in streams encompasses many applications but still lacks sound solutions. In this paper, we formulate the problem and propose a new approach that alleviates the negative effect of asynchronicity and mines asynchronous data streams. Our approach carefully exploits the information that arrives on time and builds an online ensemble structure to adaptively reuse historical models and instances. We provide theoretical guarantees for our approach and conduct extensive experiments to validate its effectiveness.
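To make the setting concrete, the sketch below illustrates one possible shape of an online ensemble that handles delayed labels: features are buffered when they arrive first, predictions are made from a weighted vote over the current and historical models, and the ensemble weights are adapted once the delayed labels show up. This is only a minimal illustration under assumed design choices; the class name `AsyncEnsemble`, its parameters (`max_models`, `decay`, `lr`), and the fixed-delay simulation are hypothetical and not the paper's actual algorithm.

```python
# Hypothetical sketch of an online ensemble for streams with delayed labels.
# NOT the paper's method; names and update rules are illustrative assumptions.
import numpy as np


class AsyncEnsemble:
    def __init__(self, dim, max_models=5, decay=0.9, lr=0.1):
        self.models = [np.zeros(dim)]   # current + retained historical linear models
        self.weights = [1.0]            # ensemble weights, adapted online
        self.buffer = {}                # instance_id -> feature, waiting for its label
        self.max_models, self.decay, self.lr = max_models, decay, lr

    def predict(self, x):
        # Weighted vote over all retained models.
        scores = np.array([w @ x for w in self.models])
        probs = np.array(self.weights) / sum(self.weights)
        return float(probs @ scores)

    def observe_feature(self, instance_id, x):
        # Feature arrives first: predict immediately, buffer until the label comes.
        self.buffer[instance_id] = x
        return self.predict(x)

    def observe_label(self, instance_id, y):
        # Delayed label arrives: re-weight the ensemble, update the newest model.
        x = self.buffer.pop(instance_id, None)
        if x is None:
            return
        for i, w in enumerate(self.models):
            loss = (w @ x - y) ** 2
            self.weights[i] *= self.decay ** loss   # penalise models that erred here
        grad = 2 * (self.models[-1] @ x - y) * x    # squared-loss gradient step
        self.models[-1] = self.models[-1] - self.lr * grad

    def checkpoint(self):
        # Freeze the current model as a historical one; keep a bounded number.
        self.models.append(self.models[-1].copy())
        self.weights.append(1.0)
        if len(self.models) > self.max_models:
            self.models.pop(0)
            self.weights.pop(0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ens = AsyncEnsemble(dim=3)
    true_w = np.array([1.0, -2.0, 0.5])
    pending = []
    for t in range(200):
        x = rng.normal(size=3)
        ens.observe_feature(t, x)
        pending.append((t, float(true_w @ x)))
        if len(pending) > 5:            # labels arrive with a fixed 5-round delay
            i, y = pending.pop(0)
            ens.observe_label(i, y)
        if t % 50 == 49:
            ens.checkpoint()
    print("final prediction error:",
          abs(ens.predict(np.ones(3)) - float(true_w @ np.ones(3))))
```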
