Feature selection is one of the most important preprocessing and analysis techniques in machine learning: it can dramatically improve the performance of learning algorithms while also providing relevant insight into the data. In online and stream learning, concept drift, i.e., a change of the underlying distribution over time, can cause significant problems for learning models and data analysis. While feature selection methods for online learning do exist, none of them targets feature selection for drift detection, i.e., the challenge of improving drift detectors by analyzing the drift itself rather than increasing model accuracy. Yet this challenge is particularly relevant in common unsupervised scenarios. In this work, we study feature selection for drift detection and drift monitoring. We develop a formal definition of a feature-wise notion of drift that admits semantic interpretation. In addition, we derive an efficient algorithm by reducing the problem to classical feature selection, and we analyze, on a theoretical level, the applicability of our approach to feature selection for drift detection. Finally, we empirically demonstrate the relevance of our considerations on several benchmarks.
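To give intuition for a feature-wise notion of drift (this is an illustrative sketch only, not the algorithm proposed in the paper): one can compare the marginal distribution of each feature between a reference window and a current window of the stream, e.g., with a two-sample Kolmogorov-Smirnov test, and rank features by how strongly their marginals changed. The function name `feature_wise_drift_scores` and the windowing setup are assumptions for illustration.

```python
import numpy as np
from scipy.stats import ks_2samp

def feature_wise_drift_scores(X_ref, X_cur):
    """Score each feature by a two-sample KS test between a reference
    window and a current window; small p-values suggest that the
    feature's marginal distribution has drifted.

    Illustrative sketch only -- the paper's method reduces drift
    monitoring to classical feature selection instead.
    """
    return np.array([
        ks_2samp(X_ref[:, j], X_cur[:, j]).pvalue
        for j in range(X_ref.shape[1])
    ])

rng = np.random.default_rng(0)
X_ref = rng.normal(size=(500, 3))   # reference window
X_cur = rng.normal(size=(500, 3))   # current window
X_cur[:, 0] += 2.0                  # inject drift into feature 0 only

pvals = feature_wise_drift_scores(X_ref, X_cur)
```

Here the drifting feature receives a much smaller p-value than the stationary ones, so ranking features by these scores localizes the drift to individual features rather than only flagging it globally.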