Abstract

Reinforcement learning refers to a class of learning tasks and algorithms based on experimental psychology’s principle of reinforcement. Recent research uses the framework of stochastic optimal control to model problems in which a learning agent has to incrementally approximate an optimal control rule, or policy, often starting with incomplete information about the dynamics of its environment. Although these problems have been studied intensively for many years, the methods being developed by reinforcement learning researchers are adding some novel elements to classical dynamic programming solution methods. This article provides a brief account of these methods, explains what is novel about them, and suggests what their advantages might be over classical applications of dynamic programming to large-scale stochastic optimal control problems.
