Reinforcement learning under temporal logic constraints as a sequence modeling problem

Daiying Tian, Hao Fang, Qingkai Yang, Haoyong Yu, Wenyu Liang, Yan Wu

doi:10.1016/j.robot.2022.104351

Abstract

Reinforcement learning (RL) under temporal logic typically suffers from slow propagation in credit assignment. Inspired by the trajectory transformer, a recent advance in machine learning, this paper models RL under temporal logic (TL) as a sequence modeling problem, in which an agent uses a transformer to fit the optimal policy satisfying finite linear temporal logic (LTLf) tasks. To combat the sparse reward issue, dense reward functions for LTLf are designed. To reduce computational complexity, a sparse transformer with local and global attention is constructed to conduct credit assignment automatically, which removes the time-consuming value iteration process. The optimal action is found by beam search performed in the transformer. The proposed method generates a series of policies fitted by sparse transformers, which maintain high accuracy in fitting the demonstrations. Finally, the effectiveness of the proposed method is demonstrated by simulations in Mini-Grid environments.
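
As a rough illustration of what a sparse attention pattern with local and global components might look like, the sketch below builds a causal mask in which each token attends to a short window of recent tokens plus a few designated global positions. The function names `local_global_mask` and `count_allowed`, the `window` size, and the choice of global positions are assumptions made for illustration; the abstract does not specify the paper's actual attention pattern.

```python
def local_global_mask(seq_len, window, global_positions):
    """Return a seq_len x seq_len boolean mask; mask[i][j] is True when
    query position i may attend to key position j.

    Hypothetical pattern (not taken from the paper): causal attention
    restricted to the last `window` tokens, plus a set of global
    positions that every later token may attend to.
    """
    global_set = set(global_positions)
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            causal = j <= i              # never look at future tokens
            local = i - j < window       # within the recent window
            is_global = j in global_set  # always-visible anchor token
            row.append(causal and (local or is_global))
        mask.append(row)
    return mask


def count_allowed(mask):
    """Number of (query, key) pairs the mask keeps."""
    return sum(sum(row) for row in mask)


if __name__ == "__main__":
    mask = local_global_mask(seq_len=8, window=3, global_positions=[0])
    # Print the pattern: 'x' marks an allowed (query, key) pair.
    for row in mask:
        print("".join("x" if allowed else "." for allowed in row))
    dense = 8 * 9 // 2  # pairs kept by a full causal mask of length 8
    print(count_allowed(mask), "of", dense, "causal pairs kept")
```

With a fixed `window` and a fixed number of global positions, the number of kept pairs grows linearly with sequence length rather than quadratically, which is the usual motivation for patterns of this kind.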

Journal: Robotics and Autonomous Systems
Publication Date: Dec 28, 2022
Citations: 2

Similar Papers

Control Barrier Functions for Abstraction-Free Control Synthesis under Temporal Logic Constraints
Luyao Niu ... Andrew Clark
14 Dec 2020

Coupled Multi-Robot Systems Under Linear Temporal Logic and Signal Temporal Logic Tasks
Lars Lindemann ... Meng Guo
IEEE Transactions on Control Systems Technology | VOL. 29
03 Jan 2020

Sampling-based approximate optimal temporal logic planning
Lening Li ... Jie Fu
01 May 2017

Localization of a Ground Robot by Aerial Robots for GPS-Deprived Control with Temporal Logic Constraints
Eric Cristofalo ... Mac Schwager
01 Jan 2017
