Improving Neural Story Generation by Targeted Common Sense Grounding

Huanru Henry Mao, Garrison Cottrell, Julian McAuley, Bodhisattwa Prasad Majumder

doi:10.18653/v1/d19-1615

Abstract

Stories generated with neural language models have shown promise in grammatical and stylistic consistency. However, the generated stories are still lacking in common sense reasoning, e.g., they often contain sentences deprived of world knowledge. We propose a simple multi-task learning scheme to achieve quantitatively better common sense reasoning in language models by leveraging auxiliary training signals from datasets designed to provide common sense grounding. When combined with our two-stage fine-tuning pipeline, our method achieves improved common sense reasoning and state-of-the-art perplexity on the WritingPrompts (Fan et al., 2018) story generation dataset.

Highlights

Story generation is the task of automatically producing compelling creative writing.
We propose evaluating the common sense of a model automatically by ranking the model’s perplexity on spurious text completions from the SWAG (Zellers et al., 2018) and Story Cloze (Mostafazadeh et al., 2016) datasets, which are designed for common sense grounding.
Our work builds upon multi-task learning (MTL) principles, as we introduce auxiliary tasks to tackle common sense reasoning (CSR).
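
The perplexity-ranking evaluation mentioned in the highlights can be illustrated with a minimal sketch. It assumes that each candidate completion has already been scored by a language model as a list of per-token natural-log probabilities, and that a model "passes" an example when the true completion receives the lowest perplexity among the candidates. The function names, the length normalization, and the toy numbers are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple

# One example: per-token log-probs for each candidate completion,
# plus the index of the true (non-spurious) completion.
Example = Tuple[List[List[float]], int]


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity as exp of the negative mean per-token log-probability."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def picks_true_completion(candidates: List[List[float]], true_index: int) -> bool:
    """True if the true completion has the lowest perplexity of all candidates."""
    ppls = [perplexity(c) for c in candidates]
    return ppls.index(min(ppls)) == true_index


def ranking_accuracy(examples: Sequence[Example]) -> float:
    """Fraction of examples where the true completion is ranked first."""
    hits = sum(picks_true_completion(c, t) for c, t in examples)
    return hits / len(examples)


# Made-up log-probabilities, for illustration only (assumption, not real data).
toy_examples: List[Example] = [
    ([[-0.5, -0.7, -0.6], [-2.0, -1.5, -2.5]], 0),  # true completion wins
    ([[-1.0, -1.2], [-0.4, -0.3]], 0),              # spurious completion wins
]

print(ranking_accuracy(toy_examples))
```

In practice the log-probabilities would come from the language model being evaluated, conditioned on the story context; the sketch only shows the ranking step.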

Summary

Introduction

Recent advances in language modeling have yielded thematic and stylistic coherence in story generation through large-scale pretraining of Transformer models (Vaswani et al., 2017). The recently introduced General Pre-trained Transformer v2 (GPT2), a Transformer trained on a large, diverse corpus of text crawled from the web (called WebText), is capable of generating stylistically coherent text but commonly produces text with logical inconsistencies. In one sample the model writes: “It was a sunny, warm summer night”. This is nonsense, as it cannot be sunny at night. The lack of common sense reasoning in such a strong language model suggests that minimizing next-token perplexity alone may be insufficient for producing models that can compose sensible stories.

Objectives

Results

Conclusion