Abstract

Imbalanced learning is a common problem in machine learning and data mining. To address it, researchers have proposed many state-of-the-art techniques, such as oversampling, undersampling, SMOTE, and cost-sensitive learning. However, the most appropriate method differs from one learning problem to another. Given an imbalanced learning problem, we propose an Instance-based Learning (IBL) recommendation algorithm that recommends the most appropriate imbalance handling method for it. First, a meta-knowledge database is built from the binary relation 〈data characteristic measures, rank of all candidate imbalance handling methods〉 of each data set. Then, when a new data set arrives, its characteristics are extracted and compared with the examples in the knowledge database, and the instance-based k-nearest neighbors algorithm is applied to identify the rank of all candidate imbalance handling methods for the new data set. Finally, the most appropriate imbalance handling method is derived by combining the recommended rank with individual bias. Experimental results on 80 public binary imbalanced data sets confirm that the proposed recommendation algorithm can effectively recommend the most appropriate imbalance handling method for a given imbalanced learning problem, with a recommendation hit rate of up to 95%.
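
The k-nearest-neighbor recommendation step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Euclidean distance between meta-feature vectors and an unweighted mean of the neighbors' ranks, neither of which the abstract specifies, and it omits the final step that combines the recommended rank with individual bias. The function names, the two meta-features, and the toy database values are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recommend_ranking(meta_db, new_features, k=3):
    """Rank candidate methods for a new data set from its k nearest neighbors.

    meta_db: list of (meta_features, ranks) pairs, where ranks maps a
             method name to its rank on that data set (1 = best).
    Returns method names ordered from most to least recommended.
    """
    if not meta_db:
        raise ValueError("meta-knowledge database is empty")
    # Select the k stored data sets closest to the new one in meta-feature space.
    neighbors = sorted(meta_db, key=lambda e: euclidean(e[0], new_features))[:k]
    methods = list(neighbors[0][1])
    # Aggregate by unweighted mean rank (an assumption, not taken from the paper).
    mean_rank = {
        m: sum(ranks[m] for _, ranks in neighbors) / len(neighbors)
        for m in methods
    }
    return sorted(methods, key=lambda m: mean_rank[m])


# Toy meta-knowledge database; features and ranks are illustrative only.
meta_db = [
    ((0.10, 5.0), {"over_sampling": 1, "under_sampling": 3, "smote": 2}),
    ((0.12, 5.5), {"over_sampling": 2, "under_sampling": 3, "smote": 1}),
    ((0.80, 1.0), {"over_sampling": 3, "under_sampling": 1, "smote": 2}),
    ((0.11, 4.8), {"over_sampling": 2, "under_sampling": 3, "smote": 1}),
]

ranking = recommend_ranking(meta_db, (0.11, 5.1), k=3)
print(ranking)
```

In this toy setup the three nearest stored data sets mostly favor SMOTE, so it comes first in the aggregated ranking. A practical version would likely normalize the meta-features before computing distances, since they can be on very different scales.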

