Abstract

In model-based reinforcement learning for robotic applications, a state transition dynamics model predicts the future states of a system from its current states and actions, so learning an accurate model in a sample-efficient way is important. This study proposes a sample-efficient learning approach that accurately learns a state transition dynamics model by fitting both the predicted next states and their derivatives. The derivatives of the feedforward neural network outputs (next states) with respect to the inputs (current states and actions) are computed using the chain rule. In addition, the effect of the activation function on learning the derivatives is illustrated using an example function composed of a sum of elementary sine functions, and several activation functions are compared with respect to accuracy. The proposed learning approach exhibits significant improvement in accuracy for both one-step and multi-step prediction with a six-degree-of-freedom robot manipulator (UR-10) in both simulated and real environments.
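
The derivative computation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the network architecture, the activation function, or the loss weighting, so the tanh hidden layers, the linear output layer, the weight `lam`, and the names `random_layers`, `forward_with_jacobian`, and `loss` are all assumptions made for illustration. The sketch propagates the Jacobian of the outputs with respect to the inputs through each layer with the chain rule, then combines a value error and a derivative error in one loss.

```python
import math
import random


def random_layers(sizes, seed=0):
    """Build random (W, b) pairs for a small MLP (hypothetical helper)."""
    rng = random.Random(seed)
    return [
        ([[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
         [rng.uniform(-1, 1) for _ in range(n_out)])
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward_with_jacobian(x, layers):
    """Return (y, J) where J[i][j] = d y_i / d x_j.

    Assumes tanh hidden layers and a linear output layer.
    """
    n_in = len(x)
    h = list(x)
    # Jacobian of the current activations w.r.t. the input: starts as identity.
    J = [[1.0 if i == j else 0.0 for j in range(n_in)] for i in range(n_in)]
    for k, (W, b) in enumerate(layers):
        # Pre-activation z = W h + b; chain rule gives dz/dx = W (dh/dx).
        z = [sum(w * hj for w, hj in zip(row, h)) + bi for row, bi in zip(W, b)]
        Jz = [[sum(row[m] * J[m][j] for m in range(len(h))) for j in range(n_in)]
              for row in W]
        if k < len(layers) - 1:
            h = [math.tanh(zi) for zi in z]
            # d tanh(z)/dz = 1 - tanh(z)^2 scales each row of the Jacobian.
            J = [[(1.0 - hi * hi) * v for v in row] for hi, row in zip(h, Jz)]
        else:
            h, J = z, Jz
    return h, J


def loss(x, layers, y_target, J_target, lam=1.0):
    """Squared error on the outputs plus lam times squared error on the Jacobian."""
    y, J = forward_with_jacobian(x, layers)
    value_err = sum((a - t) ** 2 for a, t in zip(y, y_target))
    deriv_err = sum((a - t) ** 2
                    for row, row_t in zip(J, J_target)
                    for a, t in zip(row, row_t))
    return value_err + lam * deriv_err
```

In a dynamics-learning setting, `x` would hold the concatenated current state and action and `y` the predicted next state. Swapping `math.tanh` and its derivative for another activation is the place where the activation comparison mentioned in the abstract would enter.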
