Model-Based Offline Reinforcement Learning with Local Misspecification

Kefan Dong,Yannis Flet-Berliac,Allen Nie,Emma Brunskill

doi:10.1609/aaai.v37i6.25903


Open Access

https://doi.org/10.1609/aaai.v37i6.25903


Journal: Proceedings of the AAAI Conference on Artificial Intelligence

Publication Date: Jun 26, 2023

Affiliation: Stanford University

#Local Misspecification #Distribution Mismatch


Abstract

We present a lower bound on policy performance for model-based offline reinforcement learning that explicitly captures dynamics model misspecification and distribution mismatch, and we propose an empirical algorithm for optimal offline policy selection. Theoretically, we prove a novel safe policy improvement theorem by establishing pessimism approximations to the value function. Our key insight is to select jointly over dynamics models and policies: as long as a dynamics model can accurately represent the dynamics of the state-action pairs visited by a given policy, it is possible to approximate the value of that particular policy. We analyze our lower bound in the LQR setting and also show performance competitive with previous lower bounds on policy selection across a set of D4RL tasks.
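The joint selection over dynamics models and policies described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm or its bound: the abstract does not state the form of the lower bound, so the sketch simply assumes that each (policy, model) pair comes with an estimated value under the model and a single penalty that stands in for misspecification and distribution mismatch along that policy's visited state-action pairs. The function name `select_policy`, the policy and model labels, and all numbers are hypothetical.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]  # (policy name, model name), both hypothetical labels


def select_policy(
    model_values: Dict[Pair, float],
    penalties: Dict[Pair, float],
) -> Tuple[str, str, float]:
    """Pick the (policy, model) pair with the largest pessimistic value.

    Assumed form of the lower bound (an illustration, not the paper's):
        lower_bound = value_under_model - penalty
    Maximizing over pairs is the same as taking, for each policy, its best
    model, and then choosing the policy whose best lower bound is highest.
    """
    best = None
    for (policy, model), value in model_values.items():
        lower_bound = value - penalties[(policy, model)]
        if best is None or lower_bound > best[2]:
            best = (policy, model, lower_bound)
    if best is None:
        raise ValueError("no (policy, model) pairs supplied")
    return best


# Made-up numbers: pi_b looks best under model m1, but m1 is assumed to be
# inaccurate on the state-action pairs pi_b visits, so its penalty is large.
values = {
    ("pi_a", "m1"): 10.0,
    ("pi_a", "m2"): 8.0,
    ("pi_b", "m1"): 12.0,
    ("pi_b", "m2"): 9.0,
}
penalties = {
    ("pi_a", "m1"): 1.0,
    ("pi_a", "m2"): 0.5,
    ("pi_b", "m1"): 6.0,
    ("pi_b", "m2"): 2.5,
}

result = select_policy(values, penalties)
print(result)
```

In this made-up example the policy with the highest raw model value is not selected, because the model that favors it is penalized heavily on the region that policy visits. The selected pair is the one whose model is assumed accurate where its policy actually goes, which is the intuition the abstract gives for selecting models and policies together.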
