Abstract

Interactive Fiction (IF) games with real human-written natural language texts provide a new natural evaluation for language understanding techniques. In contrast to previous text games with mostly synthetic texts, IF games pose language understanding challenges on the human-written textual descriptions of diverse and sophisticated game worlds, and language generation challenges on action command generation from a less restricted combinatorial space. We take a novel perspective on IF game solving and re-formulate it as Multi-Passage Reading Comprehension (MPRC) tasks. Our approaches utilize the context-query attention mechanisms and the structured prediction in MPRC to efficiently generate and evaluate action outputs, and apply an object-centric historical observation retrieval strategy to mitigate the partial observability of the textual observations. Extensive experiments on the recent IF benchmark (Jericho) demonstrate clear advantages of our approaches, which achieve high winning rates and low data requirements compared to all previous approaches.

Highlights

  • Interactive systems capable of understanding natural language and responding in the form of natural language text have great potential in various applications

  • In pursuit of building and evaluating such systems, we study learning agents for Interactive Fiction (IF) games

  • IF games composed of human-written texts create superb new opportunities for studying and evaluating natural language understanding (NLU) techniques due to their unique characteristics


Summary

Introduction

Interactive systems capable of understanding natural language and responding in the form of natural language text have great potential in various applications. To make RL agents learn efficiently without prohibitively exhaustive trials, the action estimation must generalize learned knowledge from tried actions to untried ones. To this end, previous approaches, starting with a single embedding vector of the observation, either predict the elements of actions independently (Narasimhan et al., 2015; Hausknecht et al., 2019a) or embed each valid action as another vector and predict the action value based on vector-space similarities (He et al., 2016). The latest observation is often not a sufficient summary of the interaction history and may not provide enough information to determine the long-term effects of actions. Previous approaches address this problem by building a representation over past observations, e.g., a graph of objects, positions, and spatial relations (Ammanabrolu and Riedl, 2019; Ammanabrolu and Hausknecht, 2020). We also provide ablation studies on our models and retrieval strategies.
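
To make the similarity-based baseline mentioned above concrete, here is a minimal sketch of scoring valid actions by the inner product between an observation vector and each action vector, in the spirit of what is attributed to He et al. (2016). It is not the method proposed in this paper, and it is not the original implementation: the function names, the toy vectors, and the choice of a plain dot product as the similarity are all assumptions made for illustration. In a real agent both embeddings would come from learned encoders.

```python
from typing import Dict, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def score_actions(
    obs_vec: Sequence[float],
    action_vecs: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """Estimate a value for each valid action as its similarity to the observation.

    Assumption: similarity is a plain dot product between fixed vectors.
    """
    return {action: dot(obs_vec, vec) for action, vec in action_vecs.items()}


def best_action(scores: Dict[str, float]) -> str:
    """Return the action with the highest estimated value."""
    return max(scores, key=scores.get)


# Hypothetical toy embeddings, hand-picked for illustration only.
obs = [1.0, 0.0, 2.0]
actions = {
    "open mailbox": [0.5, 1.0, 1.0],
    "go north": [1.0, 1.0, 0.0],
    "take lamp": [0.0, 2.0, 0.5],
}

scores = score_actions(obs, actions)
print(scores)
print(best_action(scores))
```

Because every action is compared against one observation vector, anything the encoder failed to pack into that vector is unavailable at scoring time. This is the limitation of single-vector observation embeddings that the summary contrasts with the reading-comprehension formulation.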

Related Work
Problem Formulation
RC Model for Template Actions
Multi-Paragraph Retrieval Method for Partial Observability
Training Loss
Experiments
Overall Performance
Ablative Studies
Findings
Conclusion