Learning Rewards From Linguistic Feedback

Theodore R Sumers, Karthik Narasimhan, Thomas L Griffiths, Mark K Ho, Robert D Hawkins

doi:10.1609/aaai.v35i7.16749

Abstract

We explore unconstrained natural language feedback as a learning signal for artificial agents. Humans use rich and varied language to teach, yet most prior work on interactive learning from language assumes a particular form of input (e.g., commands). We propose a general framework which does not make this assumption, instead using aspect-based sentiment analysis to decompose feedback into sentiment over the features of a Markov decision process. We then infer the teacher's reward function by regressing the sentiment on the features, an analogue of inverse reinforcement learning. To evaluate our approach, we first collect a corpus of teaching behavior in a cooperative task where both teacher and learner are human. We implement three artificial learners: sentiment-based "literal" and "pragmatic" models, and an inference network trained end-to-end to predict rewards. We then re-run our initial experiment, pairing human teachers with these artificial learners. All three models successfully learn from interactive human feedback. The inference network approaches the performance of the "literal" sentiment model, while the "pragmatic" model nears human performance. Our work provides insight into the information structure of naturalistic linguistic feedback as well as methods to leverage it for reinforcement learning.
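
The regression step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each piece of feedback has already been reduced to a scalar sentiment score paired with a vector of feature counts, and it uses ridge-regularized least squares as a stand-in for whatever regression the paper actually uses. The feature names, the sample data, and the function names are all hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] without mutating the inputs.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def infer_reward_weights(feature_rows, sentiments, ridge=1e-3):
    """Regress sentiment scores on feature counts (ridge least squares).

    Returns one weight per feature, read here as an estimate of the
    teacher's reward for that feature. The ridge term is an assumption
    added to keep the system solvable when features are collinear.
    """
    n = len(feature_rows[0])
    # Normal equations: (X^T X + ridge * I) w = X^T y
    xtx = [
        [
            sum(row[i] * row[j] for row in feature_rows) + (ridge if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    xty = [
        sum(row[i] * s for row, s in zip(feature_rows, sentiments))
        for i in range(n)
    ]
    return solve_linear_system(xtx, xty)


# Hypothetical data: columns are three made-up features
# (say "pink", "blue", "yellow"); each row is one feedback episode.
feature_rows = [
    [1, 0, 0],  # feedback mentioned only the first feature
    [0, 1, 0],  # only the second feature
    [1, 1, 0],  # both the first and second features
    [0, 0, 1],  # only the third feature
]
# Hypothetical sentiment scores in [-1, 1], e.g. from a sentiment analyzer.
sentiments = [0.8, -0.6, 0.2, 0.0]

weights = infer_reward_weights(feature_rows, sentiments)
```

In this toy data the first feature comes out with a positive weight, the second with a negative weight, and the third near zero, which is the sense in which regressing sentiment on features plays a role analogous to inverse reinforcement learning: the learner recovers a reward estimate from evaluative signals rather than from demonstrations.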


Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence
Publication Date: May 18, 2021
Citations: 3

Similar Papers

Deep reinforcement learning for cooperative robots based on adaptive sentiment feedback
Haein Jeon ... Bo-Yeong Kang
Expert systems with applications | VOL. 243
18 Aug 2023

Training Agents With Interactive Reinforcement Learning and Contextual Affordances
Francisco Cruz ... Cornelius Weber
IEEE transactions on autonomous mental development | VOL. 8
01 Dec 2016

A Study on Bayesian Recurrent Neural Networks for Forecasting Time Series Data
...
Journal of Control, Automation and Systems Engineering | VOL. 10
01 Dec 2004

Improving reinforcement learning with interactive feedback and affordances
Francisco Cruz ... Sven Magg
01 Oct 2014
