Oversampling the minority class using a dedicated fitness function and genetic algorithmic progression

Payel Sadhukhan, Sarbani Palt

doi:10.1002/cpe.6648

Abstract

Class imbalance is a pertinent characteristic of many real-world datasets. The underrepresentation of the minority class in such datasets makes it difficult to learn and predict its members. In this era of Big Data, we need to address this practical issue in order to get admissible outputs from our classifier models. We present a novel method, oversampling the minority class using a dedicated fitness function and genetic algorithmic progression, to tackle the problem of class imbalance. We generate a random set of feature points in the given space and use genetic-algorithm-guided progression to transform them into a set of potential minority points. The optimization function examines the neighborhood of each synthetic minority point to compute its fitness score. The empirical study employs 19 real-world datasets, three metrics, three classifiers, and four diverse oversampling methods. This comprehensive comparison against several well-established minority oversamplers demonstrates the efficacy of the proposed method in handling class imbalance.
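The pipeline outlined in the abstract (random initial points, genetic progression, neighborhood-based fitness) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the fitness function or the genetic operators, so everything below is an assumption made for illustration. In particular, the fitness (fraction of minority samples among the k nearest real samples), the binary tournament selection, the arithmetic crossover, the Gaussian mutation, and the names `knn_minority_fraction` and `ga_oversample` are all hypothetical choices.

```python
import numpy as np


def knn_minority_fraction(points, X, y, minority_label, k=5):
    """Assumed fitness: share of minority samples among the k nearest real samples."""
    # Pairwise Euclidean distances, shape (n_points, n_samples).
    dists = np.linalg.norm(points[:, None, :] - X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return (y[nearest] == minority_label).mean(axis=1)


def ga_oversample(X, y, minority_label, n_new, generations=30, k=5,
                  mutation_scale=0.05, seed=0):
    """Evolve n_new random points toward minority-dominated neighborhoods."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    # Random initial population inside the bounding box of the data.
    pop = rng.uniform(lo, hi, size=(n_new, X.shape[1]))
    for _ in range(generations):
        fit = knn_minority_fraction(pop, X, y, minority_label, k)
        # Binary tournament selection (assumed operator).
        a = rng.integers(0, n_new, size=n_new)
        b = rng.integers(0, n_new, size=n_new)
        parents = np.where((fit[a] >= fit[b])[:, None], pop[a], pop[b])
        # Arithmetic crossover with a randomly shuffled mate (assumed operator).
        mates = parents[rng.permutation(n_new)]
        alpha = rng.uniform(size=(n_new, 1))
        children = alpha * parents + (1.0 - alpha) * mates
        # Gaussian mutation scaled by each feature's range, clipped to the box.
        children += rng.normal(0.0, mutation_scale, size=children.shape) * (hi - lo)
        children = np.clip(children, lo, hi)
        # Elitist survival: keep the n_new fittest of parents and children.
        merged = np.vstack([pop, children])
        merged_fit = knn_minority_fraction(merged, X, y, minority_label, k)
        pop = merged[np.argsort(-merged_fit)[:n_new]]
    return pop
```

A call such as `ga_oversample(X, y, minority_label=1, n_new=32)` would return 32 synthetic points that can be appended to the training set with the minority label. The elitist survival step is a design choice of this sketch: it ensures the population's mean fitness never decreases from one generation to the next.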

Full Text