Abstract

The class imbalance problem is widely studied in the machine learning community and arises in many real-world applications such as spam filtering, anomaly detection, and medical diagnosis. In this paper, we propose an adaptive fuzzy c-means based consensus clustering approach for class-imbalanced learning. The number of base clusters is determined through a balancing optimization approach, while the initial starting points for each base partition are determined sequentially. The final partition is constructed via the co-association matrix. Finally, the center samples of the final cluster partition are selected and combined with the minority class samples to form the reduced data set. In this way, the most representative majority class samples are retained while the boundary samples are eliminated. The proposed method is validated on real-world data sets, where it demonstrates superior performance compared to other clustering-based re-sampling schemes. Thus, the fuzzy consensus clustering based under-sampling method can be used for real-life imbalanced problems.
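The pipeline outlined in the abstract (base fuzzy c-means partitions, a co-association matrix, a final partition, and selection of center samples) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: clustering is applied to the majority class alone; the base cluster counts are a fixed tuple (`cluster_counts`) rather than the result of the balancing optimization; each base partition is initialized randomly rather than sequentially; the final partition is obtained by thresholding the co-association matrix at 0.5 and taking connected components; and exactly one sample per final cluster (the one nearest the cluster mean) is kept. All function and parameter names are hypothetical.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy c-means; returns the membership matrix U of shape (n, c)."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=X.shape[0])  # random init, rows sum to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        inv = np.fmax(d, 1e-12) ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U


def co_association(X, cluster_counts, seed=0):
    """Fraction of base partitions in which each pair of samples shares a cluster."""
    n = X.shape[0]
    co = np.zeros((n, n))
    for k, c in enumerate(cluster_counts):
        # Harden each fuzzy partition by taking the highest membership.
        labels = fuzzy_c_means(X, c, seed=seed + k).argmax(axis=1)
        co += labels[:, None] == labels[None, :]
    return co / len(cluster_counts)


def consensus_labels(co, threshold=0.5):
    """Assumed consensus step: connected components of the graph co > threshold."""
    n = co.shape[0]
    labels = np.full(n, -1)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero((co[i] > threshold) & (labels == -1)):
                labels[j] = current
                stack.append(j)
        current += 1
    return labels


def undersample(X_maj, X_min, cluster_counts=(2, 3, 4), seed=0):
    """Keep one center sample per consensus cluster, plus all minority samples."""
    labels = consensus_labels(co_association(X_maj, cluster_counts, seed))
    keep = []
    for lab in np.unique(labels):
        idx = np.flatnonzero(labels == lab)
        center = X_maj[idx].mean(axis=0)
        keep.append(idx[np.linalg.norm(X_maj[idx] - center, axis=1).argmin()])
    X_red = np.vstack([X_maj[keep], X_min])
    y_red = np.r_[np.zeros(len(keep), dtype=int), np.ones(len(X_min), dtype=int)]
    return X_red, y_red
```

Calling `undersample(X_maj, X_min)` on two NumPy arrays with the same number of columns returns the reduced feature matrix and a label vector (0 for retained majority samples, 1 for minority samples). Keeping several near-center samples per cluster, rather than one, would be a natural variation when a specific class ratio is required.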
