Abstract

Genetic programming (GP) has been widely used for image classification because of its ability to learn simple and effective models. However, most GP methods require a large amount of training data to learn informative features for classification, and their generalization performance can be poor when only a few training instances are available. Most GP methods assess the quality of GP individuals (solutions) by classification accuracy alone; this paper proposes a new fitness function that additionally incorporates distance measures. The proposed method automatically applies different distance measures to binary and multi-class classification. By simultaneously minimizing the within-class distance and maximizing the between-class distance, the generalization performance can be improved. Furthermore, existing GP methods typically employ standard crossover to search for the best individuals across the whole search space, which might not fully exploit promising local regions of that space. Based on the niching technique, this paper develops a new crossover operator that enables better exploitation of both the global and local search space, improving learning effectiveness and classification accuracy. The new approach achieves significantly better generalization performance than almost all benchmark methods on eight datasets and is also computationally efficient. Further analysis demonstrates the significance of the new fitness function and crossover operator and shows the potentially good interpretability of the learned models.
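
To make the idea of a distance-aware fitness concrete, the following is a minimal sketch and not the paper's actual formulation. It assumes that each GP individual maps an image to a feature vector, uses Euclidean distance for every task, and combines accuracy with a separation score through a hypothetical `weight` parameter. The abstract does not state which distance measures are used for binary versus multi-class problems, so that distinction is not modeled here; the function names are illustrative only.

```python
import numpy as np


def within_between_distances(features, labels):
    """Return (mean within-class, mean between-class) Euclidean distance.

    Assumption: `features` holds one feature vector per training instance,
    as produced by a GP individual; `labels` holds the class labels.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances, shape (n, n).
    diffs = features[:, None, :] - features[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    same_class = labels[:, None] == labels[None, :]
    not_self = ~np.eye(len(labels), dtype=bool)
    within_pairs = dists[same_class & not_self]
    between_pairs = dists[~same_class]
    # Guard against empty selections (e.g. a class with one instance).
    within = within_pairs.mean() if within_pairs.size else 0.0
    between = between_pairs.mean() if between_pairs.size else 0.0
    return within, between


def distance_aware_fitness(accuracy, features, labels, weight=0.5):
    """Hypothetical fitness (to be maximized) mixing accuracy and separation.

    The separation score lies in [0, 1] and grows as the between-class
    distance increases relative to the within-class distance.
    """
    within, between = within_between_distances(features, labels)
    separation = between / (between + within + 1e-12)
    return weight * accuracy + (1.0 - weight) * separation


if __name__ == "__main__":
    feats = [[0, 0], [0, 1], [10, 10], [10, 11]]
    # Two individuals with equal accuracy: the one whose features form
    # compact, well-separated classes receives the higher fitness.
    print(distance_aware_fitness(1.0, feats, [0, 0, 1, 1]))
    print(distance_aware_fitness(1.0, feats, [0, 1, 0, 1]))
```

Under these assumptions, two individuals with identical training accuracy are ranked by how compact and well separated their class clusters are, which is the intuition behind using such a term when only a few training instances are available.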
