A Reinforcement Learning Agent based on Genetic Programming and Universal Search

Swarna Kamal Paul, Parama Bhaumik

doi:10.1109/iciccs48265.2020.9121014

Abstract

Universal search can serve as an asymptotically optimal agent for machine inversion and time-limited optimization problems. Its optimality is independent of problem size, but the search space grows exponentially with solution size. Reinforcement learning with gradient ascent can reduce this search space. However, in many scenarios with large state spaces, gradient information is absent for long periods, which slows down learning. Genetic programming merged with universal search is proposed to build a reinforcement learning agent that alleviates this problem. The universal search is implemented using a functional dataflow graph-based programming model with equivalent-program pruning and gradient ascent-based incremental learning. Genetic programming fits naturally into the universal search, with implicit crossover and mutation operators and without any need for problem-specific population initialization. The agent was evaluated on two problem environments and outperformed a state-of-the-art method.

Full Text