Abstract

We propose a reinforcement learning scheme for random neural networks (RNNs) that updates network weights based on the relative reward obtained by past actions. This learning scheme achieves improved performance in random environments under certain conditions, in particular when the number of actions is small. We also explore hybrid learning structures, in which an RNN functions as a meta decision unit that learns how best to delegate the action decision to a second layer of decision units. An extensive simulation study demonstrates the performance of these learning methods when applied to minimizing response times in a distributed task assignment scenario. Results are compared to the reinforcement learning scheme of cognitive packet networks, a reward-only variation, and several non-learning methods. The study reveals that no single method produces the best performance in all cases. However, both the generalized reinforcement learning scheme and the use of meta decision units can improve performance in specific cases.

