Abstract

Abstractive summarization aims to condense the main content of an article into short sentences and is an important research direction in natural language generation. Most abstractive summarization models are based on sequence-to-sequence neural networks: they encode the input text with a Bi-directional Long Short-Term Memory (bi-LSTM) network and decode the summary word by word with an LSTM. However, existing models usually consider neither the self-attention dependencies during bi-LSTM encoding nor deep latent sentence-structure information during decoding. To address these limitations, we propose SA-HVAE, a model that combines Self-Attention based word embeddings with Hierarchical Variational AutoEncoders. The model first introduces self-attention into the LSTM encoder to alleviate information decay during encoding, and then performs summarization with deep structural inference through hierarchical VAEs. Experimental results on the Gigaword and CNN/Daily Mail datasets validate the superior performance of SA-HVAE, which achieves significant improvements over the baseline models.
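
The abstract describes the architecture only at a high level. Purely as an illustration, the following is a minimal PyTorch sketch of the general idea: a bi-LSTM encoder whose hidden states are re-weighted by self-attention, plus a two-level (hierarchical) VAE latent that conditions an LSTM decoder. All module names, dimensions, and the specific two-level latent structure here are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SAHVAESketch(nn.Module):
    """Illustrative sketch only: bi-LSTM encoder + self-attention, with two
    stacked (hierarchical) VAE latents conditioning an LSTM decoder.
    Dimensions and structure are assumptions, not the paper's specification."""
    def __init__(self, vocab_size=10000, emb_dim=128, hid_dim=256, z_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True,
                               bidirectional=True)
        # self-attention over the bi-LSTM hidden states
        self.self_attn = nn.MultiheadAttention(2 * hid_dim, num_heads=4,
                                               batch_first=True)
        # two-level ("hierarchical") latent variables: z2 depends on z1
        self.to_mu1 = nn.Linear(2 * hid_dim, z_dim)
        self.to_logvar1 = nn.Linear(2 * hid_dim, z_dim)
        self.to_mu2 = nn.Linear(z_dim, z_dim)
        self.to_logvar2 = nn.Linear(z_dim, z_dim)
        # decoder LSTM generates the summary word by word
        self.decoder = nn.LSTM(emb_dim + z_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    @staticmethod
    def reparameterize(mu, logvar):
        # standard VAE reparameterization trick
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, src, tgt):
        # encode source tokens with bi-LSTM, then apply self-attention
        enc_out, _ = self.encoder(self.embed(src))
        attn_out, _ = self.self_attn(enc_out, enc_out, enc_out)
        ctx = attn_out.mean(dim=1)                       # pooled context

        # hierarchical latents: sample z1, then z2 conditioned on z1
        mu1, logvar1 = self.to_mu1(ctx), self.to_logvar1(ctx)
        z1 = self.reparameterize(mu1, logvar1)
        mu2, logvar2 = self.to_mu2(z1), self.to_logvar2(z1)
        z2 = self.reparameterize(mu2, logvar2)

        # decode, feeding the latent at every target position
        dec_in = self.embed(tgt)
        z_rep = z2.unsqueeze(1).expand(-1, dec_in.size(1), -1)
        dec_out, _ = self.decoder(torch.cat([dec_in, z_rep], dim=-1))
        logits = self.out(dec_out)

        # KL terms for both latent levels
        kl = (-0.5 * torch.sum(1 + logvar1 - mu1.pow(2) - logvar1.exp())
              - 0.5 * torch.sum(1 + logvar2 - mu2.pow(2) - logvar2.exp()))
        return logits, kl
```

In training, the cross-entropy loss over the logits would be combined with the KL term, typically with KL annealing to avoid posterior collapse; this is standard practice for text VAEs rather than a detail taken from the paper.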
