Abstract

To address the low classification accuracy of the minority class in imbalanced data, this paper proposes DPF-EL (Density Peaks and Fitness Ensemble Learning), an algorithm based on density peaks clustering and fitness. First, the method uses the density peaks clustering algorithm to divide the majority class into sub-clusters. The local density calculated during clustering is used to assign a weight to each sub-cluster, and these weights determine how many samples are under-sampled from each sub-cluster. Second, the concept of fitness is introduced within the sub-clusters: the selection probability of each sample is calculated from its fitness, and the majority class is under-sampled according to these selection probabilities. Finally, the method is combined with a boosting algorithm, and iterative training is performed on the balanced data set. Experiments on KEEL imbalanced data sets show that DPF-EL performs better than the algorithms it was compared against, which indicates the feasibility of the proposed algorithm.
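
The under-sampling steps described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: the sub-cluster labels are taken as given rather than computed by density peaks clustering, a Gaussian kernel with a hypothetical cutoff `dc` is used for local density, a sub-cluster's weight is taken to be its share of the total density, and a sample's fitness is taken to be its local density. The boosting stage is not covered.

```python
import numpy as np

def local_density(X, dc):
    """Gaussian-kernel local density, a common choice in density peaks clustering."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Subtract 1 to remove each point's own contribution, exp(0) = 1.
    return np.exp(-(dist / dc) ** 2).sum(axis=1) - 1.0

def allocate_samples(rho, labels, n_target):
    """Split n_target draws across sub-clusters by summed density (assumed rule)."""
    clusters = np.unique(labels)
    weights = np.array([rho[labels == c].sum() for c in clusters])
    weights = weights / weights.sum()
    sizes = np.array([(labels == c).sum() for c in clusters])
    # Rounding means the counts may not sum exactly to n_target.
    counts = np.minimum(np.round(weights * n_target).astype(int), sizes)
    return dict(zip(clusters.tolist(), counts.tolist()))

def undersample(X, labels, n_target, dc=1.0, seed=0):
    """Return indices of majority-class samples kept after under-sampling."""
    rng = np.random.default_rng(seed)
    rho = local_density(X, dc)
    counts = allocate_samples(rho, labels, n_target)
    chosen = []
    for c, k in counts.items():
        idx = np.flatnonzero(labels == c)
        fitness = rho[idx] + 1e-12          # assumed fitness: local density
        prob = fitness / fitness.sum()      # fitness-proportional selection
        chosen.extend(rng.choice(idx, size=k, replace=False, p=prob).tolist())
    return np.array(sorted(chosen))
```

Here `X` would hold only the majority-class samples and `n_target` would typically be set to the minority-class size, so that the retained subset and the minority class form a roughly balanced training set for each boosting round.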

