Abstract

A first step towards analyzing the runtime complexity of an evolutionary algorithm adaptively adjusted using reinforcement learning is made. We analyze the previously proposed EA + RL method, which enhances single-objective optimization by selecting efficient auxiliary fitness functions. Specifically, we consider a Random Mutation Hill Climber adjusted with Q-learning using the greedy exploration strategy. We obtain both lower and upper bounds on the number of fitness function evaluations needed for this EA + RL implementation to solve a modified OneMax problem. It turns out that EA + RL with an inefficient auxiliary fitness function performs on par with a conventional evolutionary algorithm, namely in Θ(N log N) fitness function evaluations, where N is the size of the OneMax problem. In other words, we show that reinforcement learning successfully ignores the inefficient fitness function. A lower bound for the ε-greedy exploration strategy with ε > 0 is analyzed as well.

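To make the analyzed setup concrete, the following is a minimal Python sketch of an EA + RL configuration of this kind: a Random Mutation Hill Climber whose selection criterion at each iteration is chosen by a Q-learning agent with greedy action selection, with OneMax as the target fitness and ZeroMax standing in for an inefficient auxiliary fitness function. The single-state Q-learning formulation, the ZeroMax auxiliary function, the reward definition (change in target fitness after the iteration), and all parameter values are simplifying assumptions made for illustration; they are not details taken from the paper.

```python
import random

def one_max(bits):
    """Target fitness: number of ones."""
    return sum(bits)

def zero_max(bits):
    """Hypothetical inefficient auxiliary fitness: number of zeros."""
    return len(bits) - sum(bits)

def ea_plus_rl(n, alpha=0.5, gamma=0.9, seed=0):
    """Random Mutation Hill Climber whose acceptance fitness function is
    chosen each iteration by a single-state, greedy Q-learning agent."""
    rng = random.Random(seed)
    bits = [rng.randint(0, 1) for _ in range(n)]
    fitness_functions = [one_max, zero_max]   # the RL agent's two actions
    q = [0.0, 0.0]                            # Q-values for the single state
    iterations = 0
    while one_max(bits) < n:
        # Greedy exploration: pick the fitness function with the best Q-value.
        action = max(range(len(q)), key=lambda a: q[a])
        f = fitness_functions[action]

        # RMHC step: flip one uniformly random bit.
        child = bits[:]
        child[rng.randrange(n)] ^= 1

        # Accept the child if the selected fitness function does not decrease.
        new_bits = child if f(child) >= f(bits) else bits

        # Reward: change in the *target* fitness after this iteration.
        reward = one_max(new_bits) - one_max(bits)
        bits = new_bits

        # Single-state Q-learning update.
        q[action] += alpha * (reward + gamma * max(q) - q[action])

        # Each iteration uses a constant number of fitness evaluations,
        # so the iteration count is proportional to the evaluation count.
        iterations += 1
    return iterations

if __name__ == "__main__":
    print(ea_plus_rl(100))
```

Under these assumptions, selecting the auxiliary ZeroMax function yields non-positive rewards, so its Q-value never overtakes that of OneMax and the greedy agent effectively ignores it, which is the behavior the Θ(N log N) result formalizes.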