Abstract
This paper studies the ε-greedy and softmax algorithms in the context of obstacle avoidance and the balance between exploration and exploitation. In the experiments, a suitably simplified model of the obstacle-avoidance problem was built and solved with two classical reinforcement learning algorithms, Sarsa and Q-Learning, while the softmax algorithm was used to balance exploration and exploitation. The simulation results indicate that Sarsa and Q-Learning can handle both obstacle avoidance and the exploration-exploitation balance within a limited number of time steps: the agent improves the estimated values of actions that are not currently the maximum of the state's value function, so that it can select the best of the actions it has already tried. In addition, Sarsa and Q-Learning enable the agent to try new actions and find the optimal one.
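As a rough illustration of the techniques named above, the following Python sketch shows ε-greedy and softmax action selection alongside the textbook Sarsa and Q-Learning update targets. It is not taken from the paper: all function names, the `temperature` parameter, the learning rate `alpha` and the discount factor `gamma` are assumptions introduced here for illustration, and the paper's own model and settings may differ.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon explore (random action), otherwise exploit (greedy action)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def softmax_probs(q_values, temperature):
    """Boltzmann distribution over actions; higher temperature means more exploration."""
    m = max(q_values)  # subtract the maximum for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_action(q_values, temperature, rng=random):
    """Sample an action in proportion to its softmax probability."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def sarsa_target(reward, next_q_values, next_action, gamma):
    """On-policy target: uses the action actually chosen in the next state."""
    return reward + gamma * next_q_values[next_action]


def q_learning_target(reward, next_q_values, gamma):
    """Off-policy target: uses the greedy (maximum) value in the next state."""
    return reward + gamma * max(next_q_values)


def td_update(q_old, target, alpha):
    """Move the old estimate a fraction alpha toward the target."""
    return q_old + alpha * (target - q_old)
```

The two targets differ only in which next-state value they bootstrap from, which is why Sarsa's estimates depend on the exploration policy while Q-Learning's do not. With `epsilon = 0` the ε-greedy rule reduces to pure exploitation, and as `temperature` grows the softmax rule approaches uniform random exploration.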