Balancing large margin nearest neighbours for imbalanced data

Xiaotian Zhang, Ping Huang, Jing Peng, Kai Zhou, Chang‐An Yuan, Yongqing Zhang, Shaojie Qiao, Nan Han

doi:10.1049/joe.2019.1178

Abstract

Learning a distance metric that accurately measures the distance between samples is critical for imbalanced data. However, traditional metric-learning algorithms, e.g. large margin nearest neighbour (LMNN), information-theoretic metric learning, and neighbourhood component analysis, do not take imbalanced class distributions into consideration. These methods tend to be dominated by the majority samples, so important minority samples are often ignored while the distance metric matrix is learned, which can seriously mislead decision-making systems when classifying samples. To resolve this problem, the authors propose a novel metric-learning method named balancing large margin nearest neighbour (BLMNN) for imbalanced data. BLMNN adjusts the objective function according to the class distribution so that the minority and majority classes are treated equally during optimisation. The contribution of the minority class is thus taken fully into account, which can greatly improve classification accuracy. Extensive experiments were performed on real-world imbalanced datasets. Across various evaluation indexes, comparisons with other metric-learning methods show the advantages of the proposed method.
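The abstract does not give the BLMNN objective itself, so the following is only a rough illustration of the general idea of class-balanced weighting, not the authors' formulation. It evaluates an LMNN-style loss (a pull term for target neighbours plus a unit-margin hinge push term for differently labelled samples) and, as an assumption, weights each sample's contribution by the inverse of its class size. The function name `balanced_lmnn_loss`, the `targets` argument and the inverse-class-size weights are all hypothetical choices made for this sketch.

```python
from collections import Counter


def sq_dist(a, b, M):
    """Squared Mahalanobis distance (a - b)^T M (a - b); M is a list of rows."""
    d = [ai - bi for ai, bi in zip(a, b)]
    n = len(d)
    return sum(d[r] * M[r][c] * d[c] for r in range(n) for c in range(n))


def balanced_lmnn_loss(X, y, M, targets, mu=0.5, balanced=True):
    """LMNN-style loss with optional class-balanced weights (illustrative only).

    targets[i] lists the indices of the same-class target neighbours of sample i.
    When balanced is True, sample i's terms are scaled by 1 / (size of its class);
    this weighting is an assumption, not taken from the paper.
    """
    counts = Counter(y)
    total = 0.0
    for i, xi in enumerate(X):
        weight = 1.0 / counts[y[i]] if balanced else 1.0
        for j in targets[i]:
            d_ij = sq_dist(xi, X[j], M)
            # Push term: hinge penalty for differently labelled samples that
            # come within one unit of margin of the target neighbour distance.
            push = sum(
                max(0.0, 1.0 + d_ij - sq_dist(xi, X[l], M))
                for l in range(len(X))
                if y[l] != y[i]
            )
            total += weight * ((1.0 - mu) * d_ij + mu * push)
    return total


# Toy 1-D data: two samples of class 0, one sample of class 1.
X = [[0.0], [1.0], [2.0]]
y = [0, 0, 1]
M = [[1.0]]                  # identity metric
targets = [[1], [0], []]     # the lone class-1 sample has no target neighbour

loss_balanced = balanced_lmnn_loss(X, y, M, targets)
loss_plain = balanced_lmnn_loss(X, y, M, targets, balanced=False)
```

With weights of this kind, the summed contribution of each class no longer grows with its sample count, which is one simple way to keep a large majority class from dominating the learned metric.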

Full Text