Abstract

In this work, we applied the Twin-Delayed Deep Deterministic Policy Gradient (TD3) algorithm to a challenging virtual Artificial Intelligence task: training a HalfCheetah robot, as an intelligent agent, to run across a field. TD3 is a recent Deep Reinforcement Learning model that combines several state-of-the-art techniques, including continuous Double Deep Q-Learning, Policy Gradient, and Actor-Critic. These Deep Reinforcement Learning approaches can train an intelligent agent to interact with an environment with automatic feature engineering, that is, with minimal domain knowledge. TD3 builds on the Deep Deterministic Policy Gradient (DDPG) algorithm. In our implementation, the Actor and the Critics each use a two-layer feedforward neural network with 400 and 300 hidden nodes respectively, with Rectified Linear Units (ReLU) as the activation function between layers and a final tanh unit on the output of the Actor. Overall, we developed six (6) neural networks. The Critics receive both the state and the action as input to the first layer. The parameters of both the Actor and the Critic networks were updated using the Adam optimizer. The implementation used the PyBullet continuous control environment, interfaced through OpenAI Gym. The idea behind TD3 is to reduce the overestimation bias of the value estimates: the methods that address this bias in Deep Q-Learning with discrete actions are ineffective in an Actor-Critic setting. After 500,000 training iterations, the agent achieved a maximum average reward over the evaluation time-steps of approximately 1891.
Twin-Delayed Deep Deterministic Policy Gradient (TD3) markedly improved both the learning speed and the performance of DDPG on a challenging continuous control task.
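
To make the overestimation-bias idea concrete, the following is a minimal sketch of how a TD3-style learning target can be formed for a single transition: the target Actor proposes the next action, clipped noise is added to it, and the smaller of the two target Critic values is used. It is an illustration under stated assumptions, not the implementation used in this work. The function name `td3_target`, the callables passed to it, and the default values for `discount`, `policy_noise`, and `noise_clip` are hypothetical choices that the abstract does not specify.

```python
import random


def td3_target(reward, done, next_state, actor_target, q1_target, q2_target,
               discount=0.99, policy_noise=0.2, noise_clip=0.5,
               max_action=1.0, rng=random):
    """Return a clipped double-Q learning target for one transition.

    actor_target(state) -> list of action components
    q1_target(state, action), q2_target(state, action) -> float
    All names and default values here are illustrative assumptions.
    """
    # Target policy smoothing: perturb the target action with clipped
    # Gaussian noise, then keep it inside the valid action range.
    smoothed_action = []
    for a in actor_target(next_state):
        noise = rng.gauss(0.0, policy_noise)
        noise = max(-noise_clip, min(noise_clip, noise))
        smoothed_action.append(max(-max_action, min(max_action, a + noise)))

    # Clipped double Q-learning: taking the minimum of the two target
    # Critics counteracts the tendency of a single Critic to overestimate.
    q_min = min(q1_target(next_state, smoothed_action),
                q2_target(next_state, smoothed_action))

    # No bootstrapping from the next state when the episode has ended.
    return reward + (1.0 - done) * discount * q_min


if __name__ == "__main__":
    # Stand-in networks: fixed functions instead of trained models.
    actor = lambda state: [0.3, -0.7]
    critic_a = lambda state, action: 10.0
    critic_b = lambda state, action: 4.0
    print(td3_target(1.0, 0.0, [0.0, 0.0], actor, critic_a, critic_b))
```

In a full agent, the three callables would be the target copies of the Actor and the two Critics, and both Critics would be regressed toward this shared target.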
