A stochastic reinforcement learning algorithm for learning real-valued functions

Vijaykumar Gullapalli

doi:10.1016/0893-6080(90)90056-q

Abstract

Most of the research in reinforcement learning has been on problems with discrete action spaces. However, many control problems require the application of continuous control signals. In this paper, we present a stochastic reinforcement learning algorithm for learning functions with continuous outputs using a connectionist network. We define stochastic units that compute their real-valued outputs as a function of random activations generated using the normal distribution. Learning takes place by using our algorithm to adjust the two parameters of the normal distribution so as to increase the probability of producing the optimal real value for each input pattern. The performance of the algorithm is studied by using it to learn tasks of varying levels of difficulty. Further, as an example of a potential application, we present a network incorporating these stochastic real-valued units that learns to perform an underconstrained positioning task using a simulated 3 degree-of-freedom robot arm.
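
The abstract describes the unit only in words, so a small illustration may help. The following is a minimal sketch, not the paper's algorithm: it models a single input-free unit that draws a real-valued activation from a normal distribution and adjusts the mean and standard deviation from a scalar reinforcement signal. The update rule, the reward function, the mapping from expected reinforcement to standard deviation, and all names (`train_gaussian_unit`, `r_hat`, `lr`) are assumptions made for illustration.

```python
import random


def train_gaussian_unit(target, steps=5000, lr=0.05, seed=0):
    """Illustrative stochastic real-valued unit (assumed update rule)."""
    rng = random.Random(seed)
    mu = 0.0      # mean of the normal distribution
    r_hat = 0.0   # running estimate of the expected reinforcement

    def std_dev(expected_r):
        # Assumption: explore less as the expected reinforcement rises,
        # with a small floor so the unit never becomes deterministic.
        return max(0.05, 1.0 - expected_r)

    for _ in range(steps):
        sigma = std_dev(r_hat)
        a = rng.gauss(mu, sigma)  # random activation = unit output

        # Assumed reinforcement in (0, 1]: highest when the output hits the target.
        r = 1.0 / (1.0 + (a - target) ** 2)

        # Move the mean toward activations that did better than expected,
        # and away from those that did worse.
        mu += lr * (r - r_hat) * (a - mu) / sigma
        # Track the expected reinforcement, which in turn narrows sigma.
        r_hat += lr * (r - r_hat)

    return mu, std_dev(r_hat)


if __name__ == "__main__":
    mu, sigma = train_gaussian_unit(target=0.7)
    print(f"learned mean={mu:.3f}, std dev={sigma:.3f}")
```

In this sketch the mean drifts toward the rewarded value while the standard deviation shrinks as the expected reinforcement grows, which mirrors the abstract's idea of increasing the probability of producing the optimal real value. A full network would make both parameters functions of the input pattern.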
