A study of fitness functions for data classification using grammatical evolution

Tatenda Chareka, Nelishia Pillay

doi:10.1109/robomech.2016.7813165

Abstract

Data classification is a well-studied area in which various techniques, such as support vector machines, decision trees, neural networks and evolutionary algorithms, amongst others, have been successfully applied. The research presented in this paper forms part of an initiative aimed at evaluating grammatical evolution, a recent variation of genetic programming, for data classification. The paper reports on a study conducted to compare six different measures, namely accuracy, true positive rate, false positive rate, precision, F-score and the Matthews correlation coefficient, as fitness functions for grammatical evolution. The performance of grammatical evolution using each of the six measures as a fitness function is evaluated for multi-class data classification. The study has shown that accuracy and F-score are effective as fitness functions, outperforming all other measures. In some instances accuracy produced better results than F-score. Future work will examine the correlation between the characteristics of the data set and the best performing measure.
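For readers unfamiliar with the six measures, the following is a minimal sketch of their standard textbook definitions, computed from confusion-matrix counts. It is not taken from the paper: the function name `classification_measures`, the choice of the balanced F-score (F1), the convention of scoring an undefined ratio as 0.0, and the use of binary counts (which for a multi-class problem would have to be obtained per class, for example one-vs-rest) are all assumptions made for illustration.

```python
import math


def classification_measures(tp, fp, tn, fn):
    """Compute six common measures from binary confusion-matrix counts.

    Hypothetical helper for illustration; not the authors' implementation.
    tp, fp, tn, fn: counts of true/false positives and true/false negatives.
    """

    def ratio(num, den):
        # Assumed convention: a ratio with a zero denominator scores 0.0.
        return num / den if den else 0.0

    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    tpr = ratio(tp, tp + fn)        # true positive rate (recall)
    fpr = ratio(fp, fp + tn)        # false positive rate
    precision = ratio(tp, tp + fp)
    # Balanced F-score (F1): harmonic mean of precision and recall.
    f_score = ratio(2 * precision * tpr, precision + tpr)
    # Matthews correlation coefficient, in the range [-1, 1].
    mcc = ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "accuracy": accuracy,
        "true_positive_rate": tpr,
        "false_positive_rate": fpr,
        "precision": precision,
        "f_score": f_score,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Made-up counts, used only to show the call.
    for name, value in classification_measures(40, 10, 45, 5).items():
        print(f"{name}: {value:.4f}")
```

Note that five of the measures reward higher values, whereas a lower false positive rate is better, so a fitness function built on it would need to minimise the value (or maximise its complement).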

Full Text