Abstract

Class-imbalance learning is a classic problem in the data mining and machine learning communities: the goal is to learn a model that performs equally well on all classes. Most prior work has tackled this problem either in a centralized setting or within a particular domain such as intrusion detection. In this paper, we propose to solve the class-imbalance learning problem on large-scale sparse data in a distributed setting. More specifically, we partition the data across examples and distribute each chunk to a different processing node. Each node runs a local copy of a FISTA-like algorithm, a distributed implementation of the prox-linear algorithm for cost-sensitive learning. We demonstrate the efficacy of the proposed approach on benchmark and real-world data sets and compare its performance with state-of-the-art techniques from the literature.
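The abstract does not spell out the update rules, so the following is only a minimal sketch of the general pattern it describes: example-partitioned data, a cost-weighted loss that penalizes minority-class errors more heavily, and a FISTA-style accelerated proximal-gradient loop whose gradient is aggregated across nodes. The loss (½-scaled cost-weighted squared error with L1 regularization), the function names, and the simulation of nodes as a list of chunks are all assumptions for illustration, not the authors' implementation.

```python
# Sketch only: NOT the paper's algorithm. Assumes a cost-weighted
# squared loss with L1 regularization; nodes are simulated as chunks.
import numpy as np
import scipy.sparse as sp

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def local_grad(X_chunk, y_chunk, w, class_costs):
    """Unnormalized gradient of 0.5 * sum_i c_i (x_i^T w - y_i)^2
    on one node's chunk; c_i up-weights minority-class examples."""
    residual = X_chunk @ w - y_chunk
    costs = np.where(y_chunk > 0, class_costs[1], class_costs[0])
    return X_chunk.T @ (costs * residual)

def distributed_fista(chunks, dim, lam, L, class_costs, iters=100):
    """FISTA over example-partitioned data: each iteration sums the
    per-node gradients (a loop here stands in for communication)."""
    n_total = sum(Xc.shape[0] for Xc, _ in chunks)
    w, w_prev, z, t = np.zeros(dim), np.zeros(dim), np.zeros(dim), 1.0
    for _ in range(iters):
        grad = sum(local_grad(Xc, yc, z, class_costs)
                   for Xc, yc in chunks) / n_total
        # Proximal-gradient step followed by Nesterov momentum update.
        w_prev, w = w, soft_threshold(z - grad / L, lam / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = w + ((t - 1.0) / t_next) * (w - w_prev)
        t = t_next
    return w

# Usage on synthetic imbalanced sparse data split across 4 "nodes".
rng = np.random.default_rng(0)
X = sp.random(200, 50, density=0.05, format="csr", random_state=0)
y = rng.choice([-1.0, 1.0], size=200, p=[0.9, 0.1])
chunks = [(X[i:i + 50], y[i:i + 50]) for i in range(0, 200, 50)]
w = distributed_fista(chunks, dim=50, lam=0.01, L=1.0,
                      class_costs={0: 1.0, 1: 9.0})
```

Setting the minority-class cost roughly to the inverse class ratio (9.0 here for a 9:1 imbalance) is one common heuristic for cost-sensitive learning; the paper may choose costs differently.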
