Abstract

Continuous action control is widespread in real-world applications. It directs an agent to take actions in a continuous space, moving from one state to another until a desired goal is reached. Optimizing continuous action control is an important problem: the aim is to find the optimal policy that lets the agent reach the desired goal at the lowest cost in a continuous action space. Reinforcement learning (RL) is a useful tool for this problem; it learns an optimal policy for the agent by maximizing the cumulative reward of the state transitions. When updating the policy at a given state, most existing reinforcement learning methods consider only the one-step transition of that state. In continuous action control, however, the recognizable information for a state is usually hidden in the sequence of its previous states, so these methods cannot learn the policy effectively enough. In this paper, we propose a new policy, called the convolutional deterministic policy, to solve this problem. Inspired by the convolutional neural networks used in natural language processing, our convolutional deterministic policy uses convolutional neural networks to learn the recognizable information in state sequences. For each collected state, we then update the convolutional deterministic policy using not only the recognizable information in the one-step transition of this state but also the recognizable information in the sequence of its previous states. As a result, our convolutional deterministic policy enables the agent to take better actions. Our implementation, CTD3, builds the convolutional deterministic policy on an effective reinforcement learning method, TD3. Theoretical analysis and experiments show that CTD3 learns the policy both better and faster than existing RL methods for continuous action control. The source code can be downloaded from https://github.com/grcai.
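To make the idea of a policy that reads a sequence of previous states more concrete, the following is a minimal sketch in Python with NumPy. It is not the CTD3 implementation: the class name `ConvPolicySketch`, the helper `conv1d_over_time`, the filter count, the kernel width, the max-over-time pooling, and the concatenation with the current state are all assumptions made for illustration, and the weights are random rather than trained. The sketch only shows the forward pass, in which a 1D convolution over the time axis extracts features from the state sequence before a deterministic, bounded action is produced.

```python
import numpy as np


def conv1d_over_time(seq, kernels, bias):
    """Valid 1D convolution along the time axis (hypothetical helper).

    seq:     (T, state_dim) sequence of states, oldest first
    kernels: (n_filters, width, state_dim)
    bias:    (n_filters,)
    returns: (T - width + 1, n_filters)
    """
    T = seq.shape[0]
    n_filters, width, _ = kernels.shape
    out = np.empty((T - width + 1, n_filters))
    for t in range(T - width + 1):
        window = seq[t:t + width]  # (width, state_dim)
        # Contract each kernel with the window over time and state dimensions.
        out[t] = np.tensordot(kernels, window, axes=([1, 2], [0, 1])) + bias
    return out


class ConvPolicySketch:
    """Illustrative deterministic policy over a window of states.

    All architectural choices here are assumptions, not taken from the paper.
    """

    def __init__(self, state_dim, action_dim, seq_len=4, n_filters=8,
                 width=2, max_action=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.seq_len = seq_len
        self.max_action = max_action
        self.kernels = rng.normal(0.0, 0.1, (n_filters, width, state_dim))
        self.bias = np.zeros(n_filters)
        # Output layer sees pooled sequence features plus the current state.
        self.w_out = rng.normal(0.0, 0.1, (n_filters + state_dim, action_dim))
        self.b_out = np.zeros(action_dim)

    def act(self, state_seq):
        """Map the last `seq_len` states (current state last) to one action."""
        state_seq = np.asarray(state_seq, dtype=float)
        if state_seq.shape[0] != self.seq_len:
            raise ValueError("expected a sequence of seq_len states")
        feats = np.maximum(
            conv1d_over_time(state_seq, self.kernels, self.bias), 0.0)  # ReLU
        pooled = feats.max(axis=0)  # max over time
        joint = np.concatenate([pooled, state_seq[-1]])
        # tanh keeps the action inside [-max_action, max_action].
        return self.max_action * np.tanh(joint @ self.w_out + self.b_out)
```

A call such as `ConvPolicySketch(state_dim=3, action_dim=2).act(states)`, where `states` is an array of shape `(4, 3)` holding the three previous states followed by the current one, returns a single action vector of length 2. In a TD3-style training loop, a network of this kind would take the place of the usual actor that sees only the current state; how CTD3 actually structures and trains its network is described in the paper and its source code, not here.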
