Abstract
Temporally extended actions are often effective in speeding up reinforcement learning. In this paper we present a mechanism for automatically constructing such actions, expressed as options [24], in a finite Markov Decision Process (MDP). To do this, we compute a bisimulation metric [7] between the states of a small MDP and the states of the large MDP we want to solve. The shape of this metric is then used to fully define a set of options for the large MDP. We demonstrate empirically that our approach improves the speed of reinforcement learning and is largely insensitive to parameter tuning.
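For concreteness, the sketch below shows one standard way to compute a bisimulation metric of the kind cited above (in the style of [7]) between the states of two tabular MDPs that share an action set, via a fixed-point iteration whose transition term is a Kantorovich (transportation) distance. This is a minimal illustrative sketch, not the paper's implementation: the names `kantorovich` and `bisimulation_metric`, the array layout of the reward and transition tables, the weights `c_R` and `c_T`, and the use of SciPy's linear-programming solver are all assumptions introduced here.

```python
import numpy as np
from scipy.optimize import linprog


def kantorovich(p, q, cost):
    """Kantorovich (1-Wasserstein) distance between discrete distributions
    p (length n) and q (length m) under ground-cost matrix `cost` (n x m),
    solved as a transportation linear program."""
    n, m = cost.shape
    c = cost.reshape(-1)                       # objective: sum_ij gamma_ij * cost_ij
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):                         # row marginals of the coupling equal p
        A_eq[i, i * m:(i + 1) * m] = 1.0
    for j in range(m):                         # column marginals of the coupling equal q
        A_eq[n + j, j::m] = 1.0
    b_eq = np.concatenate([p, q])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun


def bisimulation_metric(R1, P1, R2, P2, c_R=0.1, c_T=0.9, iters=50, tol=1e-6):
    """Fixed-point iteration for a bisimulation metric between the states of
    two MDPs with a shared action set (a sketch; the representations
    R1[s, a], P1[s, a, s'], R2[t, a], P2[t, a, t'] are assumptions).

    Returns a matrix d of shape (n1, n2) with d[s, t] the distance between
    state s of the first MDP and state t of the second."""
    n1, A = R1.shape
    n2, _ = R2.shape
    d = np.zeros((n1, n2))
    for _ in range(iters):
        d_new = np.zeros_like(d)
        for s in range(n1):
            for t in range(n2):
                gaps = []
                for a in range(A):
                    reward_gap = abs(R1[s, a] - R2[t, a])
                    trans_gap = kantorovich(P1[s, a], P2[t, a], d)
                    gaps.append(c_R * reward_gap + c_T * trans_gap)
                d_new[s, t] = max(gaps)        # worst-case action defines the distance
        if np.max(np.abs(d_new - d)) < tol:    # c_T < 1 makes the update a contraction
            return d_new
        d = d_new
    return d
```

The resulting matrix d(s, t) could then be used, for example, to associate each state of the large MDP with its closest state in the small MDP when constructing options, which is the role the abstract assigns to the "shape" of the metric.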