Abstract

Abstractive text summarization plays an important role in natural language processing. However, abstractive summaries produced by deep learning models often suffer from semantic inaccuracy and word repetition. To address the problem of semantic inaccuracy, we propose the MS-Pointer Network, which introduces a multi-head self-attention mechanism into the basic encoder-decoder model. Multi-head self-attention can attend to arbitrary combinations of the input words and assign higher weights to semantically related combinations, thereby enhancing the semantic features of the text so that the generated summary is more semantically coherent. In addition, the multi-head self-attention mechanism incorporates the position information of the input text, which further strengthens the semantic representation. To solve the out-of-vocabulary problem, a pointer network is added to the sequence-to-sequence model with multi-head attention; we refer to the resulting model as the MS-Pointer Network. We validate the model on the CNN/Daily Mail and Gigaword datasets and measure it with the ROUGE metric. Experiments show that abstractive summaries generated with the multi-head self-attention mechanism outperform the current open state of the art by two ROUGE points on average.
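The multi-head self-attention step described above can be sketched as follows. This is a minimal NumPy illustration of the generic mechanism (split the projections into heads, run scaled dot-product attention per head, concatenate, and project); the function names, dimensions, and random weights are illustrative, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """x: (seq_len, d_model); each projection matrix: (d_model, d_model)."""
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    # Project the inputs and split into heads: (num_heads, seq_len, d_head)
    def project(w):
        return (x @ w).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = project(w_q), project(w_k), project(w_v)

    # Scaled dot-product attention, computed independently per head
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    weights = softmax(scores, axis=-1)                   # rows sum to 1
    heads = weights @ v                                  # (heads, seq, d_head)

    # Concatenate the heads back together and apply the output projection
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ w_o

rng = np.random.default_rng(0)
d_model, seq_len, num_heads = 8, 5, 2
x = rng.standard_normal((seq_len, d_model))
ws = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(4)]
out = multi_head_self_attention(x, *ws, num_heads=num_heads)
print(out.shape)  # (5, 8)
```

Each head can attend to a different combination of input positions, which is how the mechanism gives higher weight to semantically related word groupings.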

Highlights

  • The study of text summarization is an important area of research in the field of natural language processing

  • The ROUGE scoring mechanism computes a score from the overlap between the summary generated by the decoder and the reference summary, so the higher the overlap, the higher the ROUGE score

  • In this paper, we propose the MS-Pointer Network model to generate abstractive text summaries. The model introduces a multi-head self-attention mechanism to enhance the semantic representation of the input text, so that the predicted summary stays closer in meaning to the original text and the model achieves an abstractive understanding of its semantic content


Summary

Introduction

The study of text summarization is an important area of research in natural language processing. Current text summarization divides into two categories: extractive and abstractive. Extractive summarization mainly uses ranking algorithms [1]: sentences are extracted from the original text and combined to form the summary. Abstractive summarization instead extracts the semantic features of the original text and blends them together to generate a summary that matches the original meaning; for this reason, abstractive summarization is closer to the way humans summarize text. Generating a grammatically correct abstractive summary, however, remains a challenging task.
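The extractive approach described above can be sketched with a toy frequency-based sentence ranker: score each sentence by the average corpus frequency of its words and keep the top-scoring sentences in their original order. The function name and scoring scheme are illustrative assumptions, not the specific ranking algorithm cited in [1].

```python
from collections import Counter

def extractive_summary(text, k=1):
    """Rank sentences by mean word frequency; return the top-k in original order."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    # Word frequencies over the whole document serve as a crude importance signal
    freq = Counter(w.lower() for s in sentences for w in s.split())
    scored = [(sum(freq[w.lower()] for w in s.split()) / len(s.split()), i, s)
              for i, s in enumerate(sentences)]
    top = sorted(scored, reverse=True)[:k]
    # Re-sort the selected sentences by their position in the original text
    return ". ".join(s for _, _, s in sorted(top, key=lambda t: t[1])) + "."

text = "the cat sat. the cat sat on the mat. dogs bark."
print(extractive_summary(text, k=1))
```

Because the output is stitched together from verbatim sentences, extractive summaries are always grammatical but cannot paraphrase, which is the gap abstractive methods aim to close.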

