Abstract
Advances in reinforcement learning (RL) algorithms have made RL agents increasingly capable across many tasks in recent years. However, the vast majority of RL algorithms are not readily interpretable off the shelf. Moreover, the task of generating explanations in the form of human language for these RL algorithms has not been sufficiently addressed in previous works. Human-language explanations have the advantage of being easy to understand, and they can increase the satisfaction of end users interacting with the product. In this paper, we propose a method for generating explanations in the form of free-text human language to help end users better understand the behaviors of RL agents. Our work generates explanations for both single actions and sequences of actions. We also release an open dataset as a baseline for future research. Our proposed method is evaluated in two simulated environments: Pong and the Minimalistic Gridworld Environment (MiniGrid). The results demonstrate that our models consistently generate accurate rationales that are highly correlated with expert rationales. Hence, this work offers a solution for bridging the gap of trust encountered when employing RL agents in virtual-world or real-world applications.