Abstract

In many real-world classification tasks, data arrives over time and the target concept to be learned from the data stream may change over time. Boosting methods are well-suited for learning from data streams, but do not address this concept drift problem. This paper proposes a boosting-like method to train a classifier ensemble from data streams that naturally adapts to concept drift. Moreover, it makes it possible to quantify the drift in terms of its base learners. As in regular boosting, examples are re-weighted to induce a diverse ensemble of base models. To handle drift, the proposed method continuously re-weights the ensemble members based on their performance on the most recent examples only. The proposed strategy adapts quickly to different kinds of concept drift. The algorithm is empirically shown to outperform learning algorithms that ignore concept drift, and it performs no worse than advanced adaptive time window and example selection strategies that store all the data and are thus not suited for mining massive streams. The proposed algorithm has low computational cost.
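
To make the two re-weighting mechanisms concrete, the following is a minimal Python sketch of the general idea described in the abstract: boosting-style up-weighting of misclassified examples when training each new base model, plus member weights recomputed from accuracy on the most recent batch only. The class name, batch interface, base learner, and all parameter values are illustrative assumptions, not the paper's exact algorithm.

```python
# Hedged sketch of a drift-adaptive, boosting-like stream ensemble.
# All names and constants here are hypothetical illustrations.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class DriftAdaptiveBoostingEnsemble:
    def __init__(self, max_members=10):
        self.max_members = max_members
        self.members = []   # base classifiers
        self.weights = []   # member weights from recent performance

    def partial_fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        # Boosting-style example re-weighting: up-weight examples the
        # current ensemble gets wrong, so the new member is diverse.
        sample_w = np.ones(len(y))
        if self.members:
            sample_w[self.predict(X) != y] *= 2.0  # factor is an assumption
        model = DecisionTreeClassifier(max_depth=3)
        model.fit(X, y, sample_weight=sample_w)
        self.members.append(model)
        if len(self.members) > self.max_members:
            self.members.pop(0)  # drop the oldest member
        # Drift handling: re-weight every member on the *most recent*
        # batch only, so members fit to an outdated concept lose influence.
        self.weights = [max(m.score(X, y) - 0.5, 1e-3) for m in self.members]

    def predict(self, X):
        X = np.asarray(X)
        preds = np.array([m.predict(X) for m in self.members])
        out = []
        for col in preds.T:  # weighted majority vote per example
            labels, idx = np.unique(col, return_inverse=True)
            out.append(labels[np.argmax(np.bincount(idx, weights=self.weights))])
        return np.array(out)
```

In this sketch, quantifying the drift would amount to inspecting how the member weights shift over time: a sudden collapse of the weights of older members signals a concept change.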
