Abstract
This paper is concerned with the autonomous learning of plans in probabilistic domains with out a priori domain-specific knowledge. In contrast to existing reinforcement learning algorithms that generate only reactive plans, and existing probabilistic planning algorithms that require a sub stantial amount of a priori knowledge in order to plan, a two-stage bottom-up process is devised in which first reinforcement learning/dynamic programming is applied, without the use of a pri ori domain-specific knowledge, to acquire a reactive plan, and then explicit plans are extracted from the reactive plan. Several options for plan extraction are examined, each of which is based on a beam search that performs temporal projection in a restricted fashion, guided by the value functions resulting from reinforcement learning/dynamic programming. Some completeness and soundness results are given. Examples in several domains are discussed that together demonstrate the working of the proposed model.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.