Abstract
In this study, we verified, in both simulation and experiment, the performance of a swing-up control method for a reaction wheel pendulum based on the actor-critic algorithm. The results suggest that reinforcement learning with shallow neural networks can be applied to the study of intelligent robots that act in real-world environments, such as a robot that teaches itself to walk through trial and error. The actor of the proposed actor-critic algorithm used a policy network to determine the rotational direction of the reaction wheel from the angular position and angular velocity of the pendulum and the angular velocity of the reaction wheel. The critic used a value network to estimate the expected reward from the same inputs as the actor. In both the simulated and the real-world environment, the proposed algorithm learned through trial and error to swing up and stabilize the pendulum by choosing the rotational direction of the reaction wheel, either clockwise or counter-clockwise.
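The actor and critic described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the network sizes, the reward, or the update rule, so the hidden width, learning rates, discount factor, and the one-step TD actor-critic update used here are all assumptions chosen for illustration. Only the three state inputs and the two discrete actions come from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 3   # pendulum angle, pendulum angular velocity, wheel angular velocity
HIDDEN = 16     # assumed width of the single hidden layer ("shallow" network)
N_ACTIONS = 2   # 0 = clockwise, 1 = counter-clockwise


def init_net(n_out):
    """Create parameters for a one-hidden-layer network (sizes are assumptions)."""
    return {
        "W1": rng.normal(0.0, 0.5, (HIDDEN, STATE_DIM)),
        "b1": np.zeros(HIDDEN),
        "W2": rng.normal(0.0, 0.5, (n_out, HIDDEN)),
        "b2": np.zeros(n_out),
    }


def forward(net, s):
    """Return the network output and the tanh hidden activations."""
    h = np.tanh(net["W1"] @ s + net["b1"])
    return net["W2"] @ h + net["b2"], h


def policy(actor, s):
    """Actor: softmax probabilities over the two rotation directions."""
    logits, h = forward(actor, s)
    z = np.exp(logits - logits.max())  # subtract max for numerical stability
    return z / z.sum(), h


def value(critic, s):
    """Critic: scalar estimate of the expected return from state s."""
    out, h = forward(critic, s)
    return out[0], h


def backprop(net, s, h, grad_out, lr):
    """Gradient-ascent step, given grad_out = d(objective)/d(network output)."""
    # Hidden-layer gradient is computed before W2 is modified.
    grad_h = (net["W2"].T @ grad_out) * (1.0 - h ** 2)
    net["W2"] += lr * np.outer(grad_out, h)
    net["b2"] += lr * grad_out
    net["W1"] += lr * np.outer(grad_h, s)
    net["b1"] += lr * grad_h


def actor_critic_step(actor, critic, s, a, r, s_next, done,
                      gamma=0.99, lr_actor=1e-2, lr_critic=1e-2):
    """One-step TD actor-critic update (an assumed, textbook-style rule)."""
    v, h_c = value(critic, s)
    v_next = 0.0 if done else value(critic, s_next)[0]
    td_error = r + gamma * v_next - v

    # Critic: semi-gradient TD(0), moving V(s) toward r + gamma * V(s_next).
    backprop(critic, s, h_c, np.array([td_error]), lr_critic)

    # Actor: d log pi(a|s) / d logits = onehot(a) - probs, scaled by the TD error.
    probs, h_a = policy(actor, s)
    grad_logits = -probs
    grad_logits[a] += 1.0
    backprop(actor, s, h_a, td_error * grad_logits, lr_actor)
    return td_error


# Usage: sample a rotation direction for a hypothetical state.
actor, critic = init_net(N_ACTIONS), init_net(1)
state = np.array([0.1, -0.2, 0.3])
probs, _ = policy(actor, state)
action = rng.choice(N_ACTIONS, p=probs)
```

In a training loop, `actor_critic_step` would be called once per control step with the observed reward and next state from the simulator or the physical pendulum. Restricting the actor to two outputs mirrors the abstract's choice between clockwise and counter-clockwise rotation.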
More From: Journal of Institute of Control, Robotics and Systems