Abstract

We present a generative probabilistic model of documents as sequences of sentences, and show that inference in it can lead to the extraction of long-range latent discourse structure from a collection of documents. The approach is based on embedding sequences of sentences from longer texts into 2-D or 3-D spatial grids, in which one or two coordinates model smooth topic transitions, while the third captures the sequential nature of the modeled text. A significant advantage of our approach is that the learned models are naturally visualizable and interpretable, as semantic similarity and sequential structure are modeled along orthogonal directions in the grid. We show that the method is effective in capturing discourse structure in narrative text across multiple genres, including biographies, stories, and newswire reports. In particular, our method outperforms or is competitive with state-of-the-art generative approaches on tasks such as predicting the outcome of a story and sentence ordering.
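To make the grid construction concrete, the following is a minimal, hypothetical sketch in Python, not the paper's actual model or code: a 2-D grid whose rows index sentence position (the sequential axis) and whose columns hold unigram topic distributions (the semantic axis). A document is scored by the best path through the grid in which the topic coordinate drifts by at most one cell per sentence, so topic transitions stay smooth while order is preserved. All names, the toy data, and the Viterbi-style inference below are illustrative assumptions.

# Illustrative sketch only (not the paper's model): a grid in which one axis
# tracks sentence order and the other a smoothly drifting topic coordinate.
import numpy as np

rng = np.random.default_rng(0)
VOCAB_SIZE = 50   # toy vocabulary size
SEQ_LEN = 4       # sequential axis: one row per sentence position
N_TOPICS = 6      # semantic axis: neighboring cells share similar themes

# Each grid cell holds a unigram word distribution over the vocabulary.
grid = rng.dirichlet(np.ones(VOCAB_SIZE), size=(SEQ_LEN, N_TOPICS))

def sentence_loglik(word_ids, cell_dist):
    """Log-likelihood of a bag of word ids under one cell's distribution."""
    return float(np.log(cell_dist[word_ids]).sum())

def best_path(doc):
    """Viterbi-style DP: sentence t must sit in row t of the grid, and the
    topic coordinate may shift by at most one column between consecutive
    sentences, which is what keeps topic transitions smooth."""
    T = len(doc)
    score = np.full((T, N_TOPICS), -np.inf)
    back = np.zeros((T, N_TOPICS), dtype=int)
    for k in range(N_TOPICS):
        score[0, k] = sentence_loglik(doc[0], grid[0, k])
    for t in range(1, T):
        for k in range(N_TOPICS):
            ll = sentence_loglik(doc[t], grid[t, k])
            # allowed predecessors: the same topic column or an adjacent one
            prev = range(max(0, k - 1), min(N_TOPICS, k + 2))
            j = max(prev, key=lambda p: score[t - 1, p])
            score[t, k] = score[t - 1, j] + ll
            back[t, k] = j
    # trace back the highest-scoring smooth path through the grid
    k = int(np.argmax(score[T - 1]))
    path = [k]
    for t in range(T - 1, 1 - 1, -1):
        if t == 0:
            break
        k = back[t, k]
        path.append(k)
    return path[::-1], float(score[T - 1].max())

# Toy document: 4 "sentences", each an array of word ids.
doc = [rng.integers(0, VOCAB_SIZE, size=8) for _ in range(SEQ_LEN)]
path, loglik = best_path(doc)
print("topic path:", path, "log-likelihood:", round(loglik, 2))

In the paper's 3-D variant, two coordinates would jointly model topic similarity; the dynamic program above is only a stand-in for the model's actual inference procedure, used here to show how sequential and semantic structure occupy orthogonal grid directions.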

Highlights

  • The ability to identify discourse patterns and narrative themes from language is useful in a wide range of applications and data analysis tasks

  • Computers can use such knowledge to predict what is likely to happen next in a narrative (Mostafazadeh et al., 2016), or reason about which narratives are coherent and which do not make sense (Barzilay and Lapata, 2008)

  • Knowledge of discourse is increasingly important for language generation models

Summary

Introduction

The ability to identify discourse patterns and narrative themes from language is useful in a wide range of applications and data analysis tasks. From a language understanding perspective, learning such latent structure from large corpora can provide background knowledge that can aid machine reading; computers can use such knowledge to predict what is likely to happen next in a narrative. Knowledge of discourse is also increasingly important for language generation models. While such models are good at capturing surface properties of text, fusing elements of syntax and style, they are still poor at modeling long-range dependencies that go across sentences (Li and Jurafsky, 2017; Wang et al., 2017). Models of long-range flow in the text can be useful as additional input to such methods.
