An instance-based learning recommendation algorithm of imbalance handling methods

Xueying Zhang,Ruixian Li,Bo Zhang,Yunxiang Yang,Jing Guo,Xiang Ji

doi:10.1016/j.amc.2018.12.020

Abstract

Imbalance learning is a typical problem in domain of machine learning and data mining. Aiming to solve this problem, researchers have proposed lots of the state-of-art techniques, such as Over Sampling, Under Sampling, SMOTE, Cost sensitive, and so on. However, the most appropriate methods on different learning problems are diverse. Given an imbalance learning problem, we proposed an Instance-based Learning (IBL) recommendation algorithm to present the most appropriate imbalance handling method for it. First, the meta knowledge database is created by the binary relation 〈data characteristic measures-the rank of all candidate imbalance handling methods〉 of each data set. Afterwards, when a new data set comes, its characteristics will be extracted and compared with the example in the knowledge database, where the instance-based k-nearest neighbors algorithm is applied to identify the rank of all candidate imbalance handling methods for the new dataset. Finally, the most appropriate imbalance handling method will be derived through combining the recommended rank and individual bias. The experimental results on 80 public binary imbalance datasets confirm that the proposed recommendation algorithm can effectively present the most appropriate imbalance handling method for a given imbalance learning problem, with the hit rate of recommendation up to 95%.

Full Text