Verbal Explanations for Deep Reinforcement Learning Neural Networks with Attention on Extracted Features

Xinzhi Wang,Shengcheng Yuan,Katia Sycara,Michael Lewis,Hui Zhang

doi:10.1109/ro-man46459.2019.8956301

Abstract

In recent years, there has been increasing interest in transparency in Deep Neural Networks. Most of the works on transparency have been done for image classification. In this paper, we report on work of transparency in Deep Reinforcement Learning Networks (DRLNs). Such networks have been extremely successful in learning action control in Atari games. In this paper, we focus on generating verbal (natural language) descriptions and explanations of deep reinforcement learning policies. Successful generation of verbal explanations would allow better understanding by people (e.g., users, debuggers) of the inner workings of DRLNs which could ultimately increase trust in these systems. We present a generation model which consists of three parts: an encoder on feature extraction, an attention structure on selecting features from the output of the encoder, and a decoder on generating the explanation in natural language. Four variants of the attention structure full attention, global attention, adaptive attention and object attention - are designed and compared. The adaptive attention structure performs the best among all the variants, even though the object attention structure is given additional information on object locations. Additionally, our experiment results showed that the proposed encoder outperforms two baseline encoders (Resnet and VGG) on the capability of distinguishing the game state images.

Full Text