Abstract

This paper proposes a reinforcement learning scheme using multiple prediction models (multiple model‐based reinforcement learning, MMRL). MMRL prepares multiple module pairs, each consisting of a prediction model that predicts the future state of the controlled object and a reinforcement learning controller that learns the control output. A "responsibility signal" is calculated using a soft‐max function of each prediction model's prediction error; it takes a larger value for the module whose prediction is more accurate. By weighting both the learning and the control output of each module by the responsibility signal, modules that deal with different situations are formed. To achieve a robust modular structure in MMRL without a priori knowledge, such as the number of modules or the region each should cover, a prior responsibility signal is formulated under the assumption of spatial and temporal continuity. As an efficient implementation of MMRL, an optimal controller (MLQC) based on multiple linear prediction models and quadratic reward models is formulated. To verify the performance of MLQC, a simulation of the swing‐up of a single pendulum was performed. The results show that linear prediction models and the corresponding controllers were acquired by learning for the regions near the hanging-down point and the upright point of the pendulum. The task was learned in a shorter time than with the conventional method, and redundancy of modules could be handled. © 2006 Wiley Periodicals, Inc. Electron Comm Jpn Pt 3, 89(9): 54–69, 2006; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ecjc.20266
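The core mechanism described above can be sketched in a few lines: each module's responsibility is a soft-max over its prediction error, and the overall control output is the responsibility-weighted sum of the module controllers' outputs. The sketch below is an illustration under assumed conventions (squared-error soft-max with a scaling parameter `sigma`), not the authors' implementation:

```python
import numpy as np

def responsibility_signals(prediction_errors, sigma=1.0):
    """Soft-max over (negative squared) prediction errors: the module
    whose prediction model is currently most accurate receives the
    largest weight. `sigma` is an assumed scaling parameter."""
    e = np.asarray(prediction_errors, dtype=float)
    logits = -(e ** 2) / (2.0 * sigma ** 2)
    logits -= logits.max()          # subtract max for numerical stability
    w = np.exp(logits)
    return w / w.sum()

def weighted_control_output(responsibilities, module_outputs):
    """Overall control output: responsibility-weighted sum of the
    outputs proposed by each module's controller."""
    return float(np.dot(responsibilities, module_outputs))

# Example: module 0 has the smallest prediction error, so it dominates
# both the control output and (in MMRL) each module's learning rate.
lam = responsibility_signals([0.1, 1.0, 2.0], sigma=0.5)
u = weighted_control_output(lam, [1.0, -1.0, 0.5])
```

The same responsibility values would also gate each module's learning update, so that a module specializes in the region of state space where its prediction model is accurate.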

