A large-scale comparison of concept drift detectors

Roberto Souto Maior Barros, Silas Garrido T Carvalho Santos

doi:10.1016/j.ins.2018.04.014

Abstract

Online learning involves extracting information from large quantities of data (streams) that are usually affected by changes in their distribution (concept drift). A drift detector is a small program that estimates the positions of these changes so that the base learner can be replaced, ultimately improving overall accuracy. This article reports on a large-scale comparison of 14 concept drift detector configurations for mining fully labeled data streams with concept drift, using a large number of artificial datasets and two different base classifiers (Naive Bayes and Hoeffding Tree). The goal is to measure adequately how good the existing concept drift detectors really are, and to verify and challenge a common belief in the area: that the best drift detection methods are necessarily those that detect all the existing drifts, and only those, closest to their correct positions, irrespective of the fact that different objectives usually require alternative solutions. Finally, to some extent, this article may also be seen as an extensive literature survey of concept drift detectors.
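
To make the notion of a drift detector concrete, the following is a minimal sketch of one well-known scheme, a DDM-style rule (Gama et al., 2004) that monitors the running error rate of the base learner. It is an illustration only, not the implementation evaluated in this article. The names `SimpleDDM` and `detect_drifts` and the default thresholds (30 warm-up instances, 2 and 3 standard deviations) are assumptions made for this example.

```python
import math


class SimpleDDM:
    """Illustrative DDM-style detector (hypothetical class, not from the article)."""

    def __init__(self, min_instances=30, warning_level=2.0, drift_level=3.0):
        self.min_instances = min_instances
        self.warning_level = warning_level
        self.drift_level = drift_level
        self.reset()

    def reset(self):
        # Forget all statistics, e.g. after the base learner has been replaced.
        self.n = 0
        self.errors = 0
        self.p_min = float("inf")
        self.s_min = float("inf")

    def update(self, error):
        """Feed one prediction outcome (True = misclassified); return a state."""
        self.n += 1
        self.errors += int(error)
        p = self.errors / self.n               # running error rate
        s = math.sqrt(p * (1.0 - p) / self.n)  # its standard deviation
        if self.n < self.min_instances:
            return "stable"
        if p + s <= self.p_min + self.s_min:
            # Remember the best (lowest) error level seen so far.
            self.p_min, self.s_min = p, s
        if p + s > self.p_min + self.drift_level * self.s_min:
            self.reset()
            return "drift"
        if p + s > self.p_min + self.warning_level * self.s_min:
            return "warning"
        return "stable"


def detect_drifts(error_stream):
    """Return the stream positions at which the detector signals a drift."""
    detector = SimpleDDM()
    positions = []
    for i, error in enumerate(error_stream):
        if detector.update(error) == "drift":
            # In a full pipeline, the base learner would be replaced here.
            positions.append(i)
    return positions


if __name__ == "__main__":
    # Synthetic outcomes: 10% errors for 500 instances, then 50% errors.
    stream = [i % 10 == 0 for i in range(500)] + [i % 2 == 0 for i in range(500)]
    print(detect_drifts(stream))
```

In this sketch, the reported position lags the true change point (instance 500 in the synthetic stream) because the error rate needs some instances to rise above the threshold. This delay, together with false and missed detections, is the kind of behaviour a detector comparison has to weigh against final accuracy.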

Full Text