A two-step abstractive summarization model with asynchronous and enriched-information decoding

Shuaimin Li, Jungang Xu

doi:10.1007/s00521-020-05005-3

Abstract

Most sequence-to-sequence abstractive summarization models generate summaries based on the source article and the previously generated words, but they often neglect the future information implied in the un-generated words; that is, they lack the ability to “look ahead.” In this paper, we present a novel summarization model with “lookahead” ability that fully exploits this implied future information. Our model works in two steps: (1) in the first step, we train an asynchronous decoder model whose backward decoder is not guided by the ground truth and explicitly produces and exploits the future information; (2) in the inference process, we propose an enriched-information decoding method that, in addition to the joint probability of the generated sequence, takes the future ROUGE reward of the un-generated words into account. The future ROUGE reward is predicted by a novel reward-predict model that takes the hidden states of the pre-trained asynchronous decoder model as input. Experimental results show that our two-step summarization model achieves new state-of-the-art results on the CNN/Daily Mail dataset and that, when evaluated on the test-only DUC-2002 dataset, it achieves higher scores than the state-of-the-art model.
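
As a rough illustration of the enriched-information decoding idea, the following Python sketch ranks beam candidates by their joint log-probability plus a predicted future ROUGE reward. It is a minimal sketch under stated assumptions, not the authors' implementation: the additive combination, the `weight` parameter, and the names `Candidate`, `enriched_score`, and `select_beam` are hypothetical, and the predicted reward is passed in as a plain number rather than computed by a reward-predict model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A partial summary kept on the beam (hypothetical structure)."""
    tokens: List[str]
    log_prob: float       # joint log-probability of the generated words
    future_reward: float  # predicted ROUGE reward of the un-generated words


def enriched_score(cand: Candidate, weight: float = 0.5) -> float:
    # Assumption: the two signals are combined additively with a tunable weight.
    return cand.log_prob + weight * cand.future_reward


def select_beam(candidates: List[Candidate], beam_size: int,
                weight: float = 0.5) -> List[Candidate]:
    # Keep the beam_size candidates with the highest enriched score.
    ranked = sorted(candidates, key=lambda c: enriched_score(c, weight),
                    reverse=True)
    return ranked[:beam_size]


candidates = [
    Candidate(["the", "storm"], log_prob=-1.0, future_reward=0.1),
    Candidate(["a", "storm"], log_prob=-1.2, future_reward=0.8),
    Candidate(["storm", "hits"], log_prob=-3.0, future_reward=0.2),
]
best_with_reward = select_beam(candidates, beam_size=2, weight=0.5)
best_without_reward = select_beam(candidates, beam_size=2, weight=0.0)
```

With `weight=0.0` the ranking reduces to ordinary beam search on the joint probability alone; a positive weight lets a candidate with a slightly lower probability but a higher predicted future reward move ahead.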

Full Text