Abstract

Finding a balance between diversity and convergence plays an important role in evolutionary algorithms to avoid premature convergence and to perform a better exploration of the search space. In the case of Genetic Programming, and more specifically for symbolic regression problems, different mechanisms have been devised to control diversity, ranging from novel crossover and/or mutation procedures to the design of distance measures that help genetic operators to increase diversity in the population. In this paper, we start from previous works where Straight Line Programs are used as an alternative representation to expression trees for symbolic regression, and develop a similarity measure based on edit distance in order to determine how different the Straight Line Programs in the population are. This measure is used in combination with the CHC algorithm strategy to control diversity in the population, and therefore to avoid local optima to solve symbolic regression problems. The proposal is first validated in a controlled scenario of benchmark datasets and it is compared with previous approaches to promote diversity in Genetic Programming. After that, the approach is also evaluated in a real world dataset of energy consumption data from a set of buildings of the University of Granada.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call