Multi-objective evolutionary optimization using the relationship between F1 and accuracy metrics in classification tasks

Juan Carlos Fernández, Pedro Antonio Gutiérrez, César Hervás-Martínez, Mariano Carbonero

doi:10.1007/s10489-019-01447-y

Abstract

This work analyses the complementarity and contrast between two metrics commonly used to evaluate the quality of a binary classifier: the correct classification rate or accuracy, C, and the F1 metric, which is widely used with imbalanced datasets. Based on this analysis, a set of constraints relating C and F1 is defined as a function of the ratio of positive patterns in the dataset. We evaluate the possibility of using a multi-objective evolutionary algorithm guided by this pair of metrics to optimise binary classification models. To check the validity of the constraints, we perform an empirical analysis on 26 benchmark datasets obtained from the UCI repository and a liver transplant dataset. The results show that the relation holds and that using the algorithm to optimise the pair (C, F1) simultaneously generally leads to balanced accuracy across both classes. The experiments also reveal that, in some cases, better results are obtained by taking the majority class as the positive class rather than the minority class, which is the more common choice with imbalanced datasets.
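
To make the interplay between C and F1 concrete, the following minimal Python sketch computes both metrics from a confusion matrix and evaluates one elementary bound that follows directly from the standard definitions. The function names are hypothetical, and the bound F1 <= 2p / (2p + 1 - C), with p the ratio of positive patterns, is an illustration derived here from the definitions; it is not claimed to be the set of constraints defined in the paper.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (C, F1) for a binary confusion matrix.

    C  = (TP + TN) / N
    F1 = 2 * TP / (2 * TP + FP + FN), taken as 0.0 when the denominator is 0.
    """
    n = tp + fp + fn + tn
    c = (tp + tn) / n
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return c, f1


def f1_upper_bound(c, p):
    """Illustrative upper bound on F1 given accuracy c and positive ratio p.

    Assumption (not taken from the paper): with E = FP + FN = N * (1 - c),
    F1 = 2 * TP / (2 * TP + E) increases with TP, and TP <= p * N, so
    F1 <= 2 * p / (2 * p + 1 - c).
    """
    if p == 0:
        return 0.0
    return 2 * p / (2 * p + 1 - c)


# Hypothetical imbalanced dataset: 10 minority and 90 majority patterns,
# with a trivial classifier that always predicts the majority class.

# Minority class as positive: every positive is missed.
c_min, f1_min = accuracy_and_f1(tp=0, fp=0, fn=10, tn=90)

# Majority class as positive: same predictions, labels swapped.
c_maj, f1_maj = accuracy_and_f1(tp=90, fp=10, fn=0, tn=0)

print(c_min, f1_min, f1_upper_bound(c_min, p=0.1))
print(c_maj, f1_maj, f1_upper_bound(c_maj, p=0.9))
```

In this toy case C is 0.9 under both labelings, while F1 is 0 with the minority class as positive and about 0.947 with the majority class as positive. This shows why the choice of positive class and the ratio of positive patterns matter when the pair (C, F1) is used as the optimisation target.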

Full Text