Abstract

In a dynamic world, data streams are generated continuously, and machine learning (ML) algorithms must adapt to statistical properties that change in a non-stationary context. This scenario is known as concept drift (CD): a change in the relationship between the response and predictor variables (real CD) or a change in the distribution of the input data (virtual CD) can significantly degrade a model's predictive performance, to the point where its accuracy becomes unacceptable. This paper analyzes and compares the state of the art in CD algorithms. To this end, a systematic literature review was performed, and the 10 most popular CD algorithms were extracted from the literature using a newly developed metric. The algorithms were then analyzed and compared with respect to their functionality and limitations, and optimization potentials were systematically derived from this comparison. This work provides a summarized overview of CD algorithms and a basis for algorithm optimization in this domain.
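The distinction between real and virtual CD can be illustrated with a small toy simulation. The sketch below is not taken from the paper: the threshold model, the two labeling rules (`concept_a`, `concept_b`), and the input ranges are all hypothetical choices made only to show how either kind of drift can lower the accuracy of a model that is never updated.

```python
import random

# Hypothetical fixed model, assumed to have been fit on inputs in [0, 1].
def model(x):
    return 1 if x > 0.5 else 0

# Assumed original concept: label is 1 only inside the band (0.5, 1.2).
def concept_a(x):
    return 1 if 0.5 < x < 1.2 else 0

# Assumed drifted concept: the lower edge of the band moves to 0.8,
# so the relationship between input and label itself changes (real CD).
def concept_b(x):
    return 1 if 0.8 < x < 1.2 else 0

def accuracy(label_fn, low, high, n=10_000, seed=0):
    """Accuracy of the fixed model on n inputs drawn uniformly from [low, high]."""
    rng = random.Random(seed)
    xs = [rng.uniform(low, high) for _ in range(n)]
    return sum(model(x) == label_fn(x) for x in xs) / n

# No drift: same concept and same input range as assumed at training time.
baseline = accuracy(concept_a, 0.0, 1.0)

# Real CD: the input range is unchanged, but the labeling rule changes.
real_drift = accuracy(concept_b, 0.0, 1.0)

# Virtual CD: the labeling rule is unchanged, but inputs now come from
# [0.5, 1.5], a region where the fixed model was never validated.
virtual_drift = accuracy(concept_a, 0.5, 1.5)

print(f"baseline={baseline:.3f} real={real_drift:.3f} virtual={virtual_drift:.3f}")
```

Under these assumptions the model is perfect before any drift and loses roughly 30 percentage points of accuracy in both drifted settings, although for different reasons: in the real CD case the labels themselves change, while in the virtual CD case the inputs move into a region where the model's simple rule no longer matches the unchanged concept.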
