Abstract

A new optimal local path planning method for a mobile robot is introduced. It is achieved through optimal control rules that are formed automatically based on the Q-learning (QL) method of reinforcement learning (RL), and the rules are finally executed as reactive behaviors on the robot. The state and action spaces that make up the rule structure are discretized using fuzzy logic. A lookup matrix M_Q̂ is built to store the Q̂ value of each state-action pair ⟨s, a⟩. At each state, an action is chosen from the available actions according to the Boltzmann distribution. The reinforcement signal is designed carefully in a non-uniform manner. After Q-learning, the ⟨s, a⟩ pairs with the maximum Q̂ value in each column are selected, and the optimal control rules are formed from them after merging. The algorithm automatically controls the formation of the rules and amends them conveniently. Finally, the performance of the method is tested in different environments under the control of the learned rules.

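The abstract outlines the learning loop only at a high level. The following is a minimal sketch of tabular Q-learning with Boltzmann (softmax) action selection and maximum-Q̂ rule extraction; the state/action sizes, parameter values, and function names are illustrative assumptions, and the paper's fuzzy discretization, non-uniform reinforcement signal, and rule merging are not reproduced here.

```python
import numpy as np

# Hypothetical discretization sizes; the paper derives these from fuzzy
# partitions of the robot's sensor readings and steering actions.
N_STATES, N_ACTIONS = 27, 5
ALPHA, GAMMA, TAU = 0.1, 0.9, 0.5   # learning rate, discount factor, Boltzmann temperature

# Lookup matrix M_Q-hat: one Q-hat value per state-action pair <s, a>.
Q = np.zeros((N_STATES, N_ACTIONS))

def boltzmann_action(state):
    """Sample an action from the Boltzmann (softmax) distribution over Q-hat(s, .)."""
    prefs = Q[state] / TAU
    prefs -= prefs.max()                       # numerical stability
    probs = np.exp(prefs) / np.exp(prefs).sum()
    return np.random.choice(N_ACTIONS, p=probs)

def q_update(s, a, r, s_next):
    """Standard one-step Q-learning update of the stored Q-hat value."""
    Q[s, a] += ALPHA * (r + GAMMA * Q[s_next].max() - Q[s, a])

def extract_rules():
    """Keep the maximum-Q-hat action for every state after learning.

    These <s, a> pairs form the control rules; the merging of equivalent
    rules described in the paper is omitted in this sketch.
    """
    return {s: int(np.argmax(Q[s])) for s in range(N_STATES)}
```

In use, `boltzmann_action` and `q_update` would be called inside an episode loop driven by the robot's (or a simulator's) state transitions and reward signal, and `extract_rules` would be called once learning has converged.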