Abstract

A learning automaton (LA) is a powerful tool for reinforcement learning. Its action probability vector plays two roles: 1) deciding when the LA converges, which determines the total computing budget it uses, and 2) allocating that computing budget among actions to identify the optimal one. These two intertwined roles lead to a problem: most of the computing budget goes to the currently estimated optimal action because of its high action probability, regardless of whether such an allocation helps identify the true optimal action. This work proposes a new class of LA that does not use the action probability vector for computing budget allocation. Instead, the vector is used only to determine whether the LA has converged, and optimal computing budget allocation is employed to allocate the computing budget in a way that maximizes the probability of identifying the true optimal action. ε-optimality is proven. Simulations verify its advantages over existing algorithms.

A learning automaton (LA) represents an important learning mechanism with applications in automated system design, biological systems, computer vision, and transportation. It updates its action probability vector in accordance with the inputs received from the environment to improve its performance. It acts as an adaptive controller in modeling a process as well as in generating appropriate control signals. Existing LAs simply employ heuristics to update their action probability vectors and then use those vectors both for ordinal optimization and for determining the computing budget size. This work separates ordinal optimization from the action probability vector and introduces optimal computing budget allocation to maximize the probability of selecting the true optimal action. Compared with state-of-the-art methods in five popular environments, the proposed LA improves learning efficiency by 10.93% to 65.94%.
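To make the budget-allocation idea concrete, the following is a minimal sketch of the classical optimal computing budget allocation (OCBA) rule for selecting the action with the largest mean reward. It is an illustration only, not the algorithm proposed in this work: the function name `ocba_fractions` and the sample numbers are hypothetical, and the sketch assumes that the estimated means are distinct and that all standard deviations are positive.

```python
import math


def ocba_fractions(means, stds):
    """Budget fractions under the classical OCBA rule (larger mean is better).

    Assumes distinct means and strictly positive standard deviations.
    """
    k = len(means)
    # Index of the currently estimated best action.
    best = max(range(k), key=lambda i: means[i])
    others = [i for i in range(k) if i != best]

    weights = [0.0] * k
    for i in others:
        # Non-best actions: weight proportional to (sigma_i / delta_i)^2,
        # where delta_i is the gap to the estimated best mean.
        delta = means[best] - means[i]
        weights[i] = (stds[i] / delta) ** 2

    # Best action: sigma_b * sqrt(sum_i (N_i / sigma_i)^2) over non-best i.
    weights[best] = stds[best] * math.sqrt(
        sum((weights[i] / stds[i]) ** 2 for i in others)
    )

    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical estimates for three actions.
    fractions = ocba_fractions([0.8, 0.7, 0.3], [0.4, 0.4, 0.4])
    print(fractions)
```

Under this rule, a close competitor of the estimated best action receives a much larger share of the budget than a clearly inferior action, whereas allocation by action probability would direct most samples to the estimated best action alone.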
