Abstract

Universal search can serve as an asymptotically optimal agent for machine inversion and time-limited optimization problems. Its optimality is independent of problem size, but the search space grows exponentially with solution size. Reinforcement learning with gradient ascent can shrink this search space. However, in many scenarios with large state spaces, gradient information is absent for long periods, which slows learning. To alleviate this problem, we propose merging genetic programming with universal search to build a reinforcement learning agent. The universal search is implemented using a functional, dataflow-graph-based programming model with pruning of equivalent programs and incremental learning based on gradient ascent. Genetic programming fits naturally into the universal search, with implicit crossover and mutation operators and no need for problem-specific population initialization. The agent was evaluated on two problem environments and outperformed the state-of-the-art method.
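The exponential dependency on solution size can be illustrated with a minimal sketch of a Levin-style universal search on a toy inversion problem. This is not the implementation described above: the three-instruction set `OPS`, the helper `run`, the function `levin_search`, and the phase schedule (a program of length l receives 2^(phase - l) steps) are assumptions chosen only for illustration, and the sketch omits the dataflow-graph model, equivalent-program pruning, gradient ascent, and genetic operators.

```python
from itertools import product

# Hypothetical toy instruction set; NOT the paper's dataflow-graph model.
OPS = {
    "inc": lambda v: v + 1,
    "dbl": lambda v: v * 2,
    "neg": lambda v: -v,
}


def run(program, x, budget):
    """Apply each op in turn (one step per op); return None if the budget runs out."""
    value = x
    for steps, name in enumerate(program, start=1):
        if steps > budget:
            return None
        value = OPS[name](value)
    return value


def levin_search(x, target, max_phase=12):
    """Return (program, candidates_tried) for the first program mapping x to target.

    Assumed schedule: in each phase, a program of length l gets 2**(phase - l)
    steps, so longer programs receive an exponentially smaller time share.
    """
    tried = 0
    for phase in range(1, max_phase + 1):
        for length in range(1, phase + 1):
            budget = 2 ** (phase - length)
            # There are len(OPS)**length candidates of this length.
            for program in product(OPS, repeat=length):
                tried += 1
                if run(program, x, budget) == target:
                    return program, tried
    return None, tried


if __name__ == "__main__":
    program, tried = levin_search(3, 8)
    print(program, tried)
```

With three instructions there are 3^l candidate programs of length l, so the number of candidates grows exponentially with solution length, which is the cost that learning-guided search aims to reduce.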
