Abstract

Learning from data streams is among the contemporary challenges in the machine learning domain, and such streams are frequently plagued by the class imbalance problem. In non-stationary environments, the ratios among classes, as well as their roles (majority and minority), may change over time. Class imbalance is usually alleviated by balancing classes with resampling. However, resampling suffers from limitations, such as a lack of adaptation to concept drift and the possibility of shifting the true class distributions. In this paper, we propose a novel ensemble approach, where each new base classifier is built using a low-dimensional embedding. We use a class-dependent entropy linear manifold to find the most discriminative low-dimensional representation that is, at the same time, skew-insensitive. This allows us to address two challenging issues: (i) learning efficient classifiers from imbalanced and drifting streams without data resampling; and (ii) simultaneously tackling high-dimensional and imbalanced streams that pose extreme challenges to existing classifiers. Our proposed low-dimensional representation algorithm is a flexible plug-in that can work with any ensemble learning algorithm, making it a highly useful tool for difficult scenarios of learning from high-dimensional, imbalanced, and drifting data streams.
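The plug-in idea described above can be sketched in a few lines. The following is a minimal, hypothetical illustration only: it replaces the paper's class-dependent entropy linear manifold with a simple Fisher (LDA) direction as a stand-in discriminative embedding, and uses a chunk-based ensemble of nearest-centroid members that discards its oldest members to cope with concept drift. All class and function names here are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def fisher_direction(X, y):
    # Stand-in embedding: binary Fisher LDA direction. The paper instead
    # learns a skew-insensitive entropy linear manifold (not shown here).
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)
    # Regularize the within-class scatter before solving for the direction.
    w = np.linalg.solve(Sw + 1e-6 * np.eye(X.shape[1]), m1 - m0)
    return w / np.linalg.norm(w)

class CentroidMember:
    """One base classifier trained on a single stream chunk, in the
    low-dimensional (here: 1-D) embedded space."""
    def __init__(self, X, y):
        self.w = fisher_direction(X, y)
        z = X @ self.w
        self.c0, self.c1 = z[y == 0].mean(), z[y == 1].mean()

    def predict(self, X):
        z = X @ self.w
        return (np.abs(z - self.c1) < np.abs(z - self.c0)).astype(int)

class EmbeddingEnsemble:
    """Chunk-based ensemble: each new chunk yields a new member built on a
    fresh embedding; old members are dropped to adapt to drift."""
    def __init__(self, max_members=10):
        self.members, self.max_members = [], max_members

    def partial_fit(self, X, y):
        self.members.append(CentroidMember(X, y))
        self.members = self.members[-self.max_members:]

    def predict(self, X):
        votes = np.mean([m.predict(X) for m in self.members], axis=0)
        return (votes >= 0.5).astype(int)
```

Note that no resampling is performed anywhere: the minority class is handled by the discriminative embedding itself, which is the point the abstract makes about avoiding shifts in the true class distributions.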
