Abstract

Incrementally learning from large volumes of streaming data over time is a problem of crucial importance to the computational intelligence community, especially in scenarios where it is impractical or simply infeasible to store all historical data. Learning becomes particularly challenging when the probabilistic properties of the data change with time (e.g., gradually or abruptly) and class labels are scarce. Many existing strategies for learning in nonstationary environments use the most recent batch of training data to tune their parameters (e.g., to calculate classifier voting weights) and never reassess these parameters when the unlabeled test data arrive. Making a limited drift assumption is one common way to justify not re-evaluating a classifier's parameters; however, a distribution that was learned in the past can effectively be forgotten if its data reappear at test time after not being observed for a long time. This is one form of abrupt concept drift with unlabeled data. In this work, an incremental spectral learning meta-classifier is presented for learning in nonstationary environments such that: (i) new classifiers can be added to an ensemble when labeled data are available, (ii) the ensemble voting weights are determined from the unlabeled test data to boost recollection of previously learned distributions of data, and (iii) the limited drift assumption is removed from the test-then-train evaluation paradigm. We benchmark our proposed approach on several widely used concept drift data sets.
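
The abstract's core idea is that ensemble voting weights are re-estimated from each unlabeled test batch rather than fixed from the last labeled batch. The paper's exact procedure is not reproduced here; the following is a minimal sketch of spectral, unlabeled-data-driven weighting in the spirit of the spectral meta-learner of Parisi et al. (2014), assuming binary classifiers whose votes are in {-1, +1}. The names (`SpectralDriftEnsemble`, `spectral_weights`), the decision-tree base learners, and the sign and clipping conventions are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def spectral_weights(P):
    """Estimate voting weights from unlabeled predictions alone.

    P: (m, n) array of +/-1 votes by m classifiers on n unlabeled samples.
    Under a conditional-independence assumption, the leading eigenvector of
    the vote covariance matrix is proportional to each classifier's balanced
    accuracy minus 1/2 (Parisi et al., 2014), so its entries can serve as
    voting weights.  Assumes at least two ensemble members.
    """
    Q = np.cov(P)                      # (m, m) covariance of the votes
    v = np.linalg.eigh(Q)[1][:, -1]    # leading eigenvector (eigh sorts ascending)
    if v.sum() < 0:                    # eigenvectors have arbitrary sign;
        v = -v                         # assume most members beat chance
    return np.clip(v, 0.0, None)       # zero out members that look worse than chance

class SpectralDriftEnsemble:
    """Hypothetical incremental ensemble: grow on labeled batches,
    re-weight on every unlabeled test batch."""

    def __init__(self):
        self.members = []

    def train(self, X, y):
        # (i) add a new classifier whenever a labeled batch arrives
        self.members.append(DecisionTreeClassifier(max_depth=5).fit(X, y))

    def predict(self, X):
        # (ii) weights come from this unlabeled batch, not the last labeled one
        P = np.array([m.predict(X) for m in self.members])  # votes in {-1, +1}
        if len(self.members) == 1:
            return P[0]
        w = spectral_weights(P)
        return np.sign(w @ P)          # weighted majority vote (sign 0 = tie)
```

In a test-then-train loop, `predict` would be called on each incoming batch before its labels are revealed. Because the weights are recomputed from the batch itself, members trained on a long-absent distribution can regain influence when that distribution recurs, which is the recollection effect the abstract describes.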
