Construction of EBRB classifier for imbalanced data based on Fuzzy C-Means clustering

Geng-Geng Liu, Ying-Ming Wang, Long-Jiang Chen, Yang-Geng Fu, Ji-Feng Ye, Ze-Feng Yin

doi:10.1016/j.knosys.2021.107590

Abstract

The Extended Belief Rule-Based (EBRB) system has been widely used to solve real-world problems involving incompleteness, uncertainty, and ambiguity. However, EBRB is essentially a data-driven method in which each rule is obtained from the training data. Consequently, the generated extended belief rules may be severely biased when the classes are imbalanced: the number of rules generated from majority-class samples (i.e., negative samples) may be much larger than the number generated from minority-class samples (i.e., positive samples), which can significantly bias the system's decisions. To resolve this problem, this paper proposes a novel EBRB system based on fuzzy C-means clustering (FCM-EBRB). First, FCM clustering is used to oversample the positive samples and undersample the negative ones so that the two classes are balanced. Next, the construction method of EBRB is improved, and the system is optimized through an efficient parameter learning strategy. Finally, comprehensive comparison experiments are conducted on a synthetic binary classification dataset and 11 commonly used public class-imbalance datasets from KEEL. Experimental results show that the proposed method effectively reduces the scale of the rule base and achieves high inference accuracy, especially for imbalanced data.
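
To make the rebalancing idea concrete, the following is a minimal sketch of the standard fuzzy C-means updates, followed by one possible way to use them for balancing two classes. The abstract does not specify the paper's actual resampling procedure, so everything in `rebalance` is an assumption made for illustration: negatives are summarized by their FCM cluster centers, and synthetic positives are interpolated between a real positive sample and its highest-membership cluster center. The function names, the `target` parameter, and the choice of two positive clusters are hypothetical.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means. Returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    power = 2.0 / (m - 1.0)
    for _ in range(iters):
        # Center update: mean of all points weighted by u_ik^m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            # Clamp distances so a point sitting on a center does not divide by zero.
            dist = [max(math.dist(points[i], ck), 1e-12) for ck in centers]
            new_row = [
                1.0 / sum((dist[k] / dist[j]) ** power for j in range(c))
                for k in range(c)
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:
            break
    return centers, u


def rebalance(pos, neg, target, seed=0):
    """Hypothetical balancing scheme (an assumption, not the paper's method).

    Assumes len(pos) <= target <= len(neg).
    """
    rng = random.Random(seed)
    # Undersample: replace the negatives with `target` FCM cluster centers.
    neg_centers, _ = fuzzy_c_means(neg, c=target, seed=seed)
    # Oversample: cluster the positives (two clusters is an arbitrary choice),
    # then interpolate each drawn sample toward its highest-membership center.
    pos_centers, u = fuzzy_c_means(pos, c=min(2, len(pos)), seed=seed)
    new_pos = [list(p) for p in pos]
    while len(new_pos) < target:
        i = rng.randrange(len(pos))
        k = max(range(len(pos_centers)), key=lambda j: u[i][j])
        t = rng.random()
        new_pos.append([a + t * (b - a) for a, b in zip(pos[i], pos_centers[k])])
    return new_pos, neg_centers
```

Because each synthetic positive lies on a segment between a real positive and a weighted mean of the positives, it stays inside the region already occupied by the minority class. Whether the paper uses this scheme or a different one cannot be determined from the abstract alone.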

Full Text