Abstract

Generalization is one of the most important performance evaluation criteria for artificial learning systems, in particular for supervised learning. While a large body of literature and well-established results exists on generalization for many non-evolutionary Machine Learning strategies, such as Support Vector Machines, the issue has not received the attention it deserves in Genetic Programming (GP), and only recently have a few papers dealing with generalization appeared (see for instance [1, 2, 3]). In this paper, we motivate and empirically show that GP using a Pareto multi-optimization on the training set has a remarkably higher generalization ability than canonical (standard) GP, besides counteracting bloat more efficiently and maintaining a higher diversity inside the population. Here is an informal motivation for this idea: in Figure 1, we have plotted two simple hypothetical fitness functions and two simple hypothetical GP individuals that have good fitness on the training set but bad generalization ability when the sum of errors is considered as the sole evaluation criterion. Even though the gray and black curves are very close for points inside the training set (and thus fitness is good on the training set, if fitness is the sum of errors), outside the training set they are very far from each other, and they grow farther apart the farther the points lie from the training set. This happens because the gray and black curves are uncorrelated and because the distances between points of the gray curves and points of the black curves with the same abscissa inside the training set all differ from one another. Thus, our multi-optimization framework uses three optimization criteria on the training set: the sum of errors, the statistical correlation between targets and outputs, and the variance of the pairwise distances between targets and outputs. Simulations have been executed on three
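As a rough illustration of the three training-set criteria named above, the following Python sketch computes them for one individual and compares two individuals by Pareto dominance. It is a minimal sketch under stated assumptions, not the implementation used in the paper: the function names are hypothetical, the error is taken as the sum of absolute errors, the correlation is the Pearson coefficient turned into a minimization objective as 1 - r, and the "pairwise distances" are read as the distances between target and output at the same abscissa.

```python
import math


def sum_of_errors(targets, outputs):
    # Assumption: "sum of errors" is the sum of absolute differences.
    return sum(abs(t - o) for t, o in zip(targets, outputs))


def pearson_correlation(targets, outputs):
    # Pearson correlation coefficient between targets and outputs.
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    ss_t = sum((t - mean_t) ** 2 for t in targets)
    ss_o = sum((o - mean_o) ** 2 for o in outputs)
    denom = math.sqrt(ss_t * ss_o)
    # Assumption: a constant curve is treated as uncorrelated (r = 0).
    return cov / denom if denom > 0 else 0.0


def distance_variance(targets, outputs):
    # Population variance of the point-wise distances |t - o|.
    dists = [abs(t - o) for t, o in zip(targets, outputs)]
    mean_d = sum(dists) / len(dists)
    return sum((d - mean_d) ** 2 for d in dists) / len(dists)


def objectives(targets, outputs):
    # All three values are to be minimized (1 - r is an assumed encoding).
    return (
        sum_of_errors(targets, outputs),
        1.0 - pearson_correlation(targets, outputs),
        distance_variance(targets, outputs),
    )


def dominates(a, b):
    # a Pareto-dominates b: no worse on every objective, better on one.
    return all(x <= y for x, y in zip(a, b)) and any(
        x < y for x, y in zip(a, b)
    )


# Hypothetical data: both individuals have the same sum of errors, but
# the first is a constant shift of the target and the second is not.
targets = [0.0, 1.0, 2.0, 3.0]
shifted = [1.0, 2.0, 3.0, 4.0]
distorted = [0.0, 1.0, 2.0, 7.0]

obj_shifted = objectives(targets, shifted)
obj_distorted = objectives(targets, distorted)
print(dominates(obj_shifted, obj_distorted))
```

In this toy case the sum of errors alone cannot separate the two individuals, whereas the correlation and distance-variance objectives favor the one whose shape follows the target.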
