GA-based reinforcement learning for neural networks

Chin-Teng Lin,Chong-Ping Jou,Cheng-Jiang Lin

doi:10.1080/00207729808929517

Abstract

A genetic reinforcement neural network (GRNN) is proposed to solve various reinforcement learning problems. The proposed GRNN is constructed by integrating two feedforward multilayer networks. One neural network acts as an action network for determining the outputs (actions) of the GRNN, and the other as a critic network to help the learning of the action network. Using the temporal difference prediction method, the critic network can predict the external reinforcement signal and provide a more informative internal reinforcement signal to the action network. The action network uses the genetic algorithm (GA) to adapt itself according to the internal reinforcement signal. The key concept of the proposed GRNN learning scheme is to formulate the internal reinforcement signal as the fitness function for the GA. This learning scheme forms a novel hybrid GA, which consists of the temporal difference and gradient descent methods for the critic network learning, and the GA for the action network learning. By using the internal reinforcement signal as the fitness function, the GA can evaluate the candidate solutions (chromosomes) regularly, even during the period without external reinforcement feedback from the environment. Hence, the GA can proceed to new generations regularly without waiting for the arrival of the external reinforcement signal. This can usually accelerate the GA learning because a reinforcement signal may only be available at a lime long after a sequence of actions has occurred in the reinforcement learning problems. Computer simulations have been conducted to illustrate the performance and applicability of the proposed learning scheme.

Full Text