Abstract

Linear temporal logic (LTL) and omega-regular objectives (a superset of LTL) have seen recent use as a way to express non-Markovian objectives in reinforcement learning. We introduce a model-based probably approximately correct (PAC) learning algorithm for omega-regular objectives in Markov decision processes (MDPs). As part of the development of our algorithm, we introduce the epsilon-recurrence time: a measure of the speed at which a policy converges to satisfying the omega-regular objective in the limit. We prove that our algorithm requires only a polynomial number of samples in the relevant parameters, and we perform experiments that confirm our theory.
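For orientation, a PAC guarantee of the kind described above typically has the following shape (an illustrative sketch only: the symbols below, including the epsilon-recurrence time written as T_epsilon and the exact parameter dependence, are generic placeholders and not the paper's stated theorem):

% Illustrative shape of a PAC guarantee for an omega-regular objective phi in an MDP M
% (not the paper's exact statement; parameters are assumed for the sketch).
\[
\Pr\Big[\, \mathbb{P}^{\hat{\pi}}_{\mathcal{M}}(\varphi) \;\ge\; \sup_{\pi} \mathbb{P}^{\pi}_{\mathcal{M}}(\varphi) - \varepsilon \,\Big] \;\ge\; 1 - \delta
\quad\text{after}\quad
N = \mathrm{poly}\!\left(|S|,\, |A|,\, \tfrac{1}{\varepsilon},\, \tfrac{1}{\delta},\, T_{\varepsilon}\right) \text{ samples},
\]

where M denotes the MDP, phi the omega-regular objective, pi-hat the learned policy, and T_epsilon a quantity such as the epsilon-recurrence time that bounds how quickly a policy's behavior settles into satisfying the objective.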
