Abstract

This paper proposes an adaptive modular reinforcement learning architecture and an algorithm for robot control operating in multiple environments. Reinforcement learning autonomously acquires control rules through interaction between the agent and the controlled system. Consequently, reinforcement learning is expected to be applied to robot control where model identification is difficult. These robots are often expected to operate in multiple environments. However, existing reinforcement learning algorithms require prior knowledge of changes in the environment. In this paper, we propose an architecture and algorithm that do not require prior knowledge of the environment. In this architecture, the policy is acquired by increasing the number of modules based on the interaction with the controlled system. Therefore, the proposed method can be applied to robots whose dynamics change, while preserving the property of reinforcement learning that no prior knowledge of the controlled system is required. In two numerical experiments, the proposed method improved performance by approximately 25% compared with conventional methods.

Highlights

  • Reinforcement learning is a framework for machine learning that optimizes the operation of a system by solving multistep decision-making problems [1]

  • Reinforcement learning is applicable to robot control, where the state transition is deterministic but the dynamics of the controlled system are complex and model identification is difficult

  • We propose an adaptive modular linear quadratic tracker (AMLQT), an implementation of the adaptive modular reinforcement learning architecture for solving the problems formulated in Section III


Summary

Introduction

Reinforcement learning is a machine learning framework that optimizes the operation of a system by solving multistep decision-making (action decision) problems [1]. It consists of an agent (controller) and a controlled system. The agent receives the states of the controlled system and a reward that indicates the control performance, and it chooses, by trial and error, the action (control input) that maximizes the accumulated reward. A key feature of reinforcement learning is that the policy is acquired autonomously through the interaction between the agent and the controlled system. Because of this feature, reinforcement learning is applicable to robot control, where the state transition is deterministic but the dynamics of the controlled system are complex and model identification is difficult. Reinforcement learning has been applied to various robot control tasks [2]–[6], such as reconfigurable robots [7]–[9] and trajectory control for manipulators [10].
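The interaction between agent and controlled system described above can be illustrated with a minimal sketch. This is not the method proposed in this paper: it uses standard tabular Q-learning on a hypothetical five-state line task with deterministic transitions. All names (`step`, `choose`, `train`, `greedy_policy`) and all parameter values are assumptions chosen for illustration only.

```python
import random

N_STATES = 5        # hypothetical toy task: positions 0..4, position 4 is the goal
ACTIONS = (-1, +1)  # control inputs: step left or step right


def step(state, action):
    """Deterministic state transition of the toy controlled system."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward signals control performance
    return next_state, reward, done


def choose(q_row, epsilon, rng):
    """Epsilon-greedy action index; ties are broken at random."""
    if rng.random() < epsilon or q_row[0] == q_row[1]:
        return rng.randrange(2)
    return 0 if q_row[0] > q_row[1] else 1


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error (tabular Q-learning)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            a = choose(q[state], epsilon, rng)
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Greedy action for every non-terminal state."""
    return [ACTIONS[0 if q[s][0] > q[s][1] else 1] for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` returns the learned control input for each non-terminal state. The agent never sees the transition rule inside `step`; it only observes states and rewards, which is the property that makes reinforcement learning attractive when model identification is difficult.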
