Abstract
Although the Transformer model has outperformed traditional sequence-to-sequence models on a variety of natural language processing (NLP) tasks, it still suffers from semantic irrelevance and repetition in abstractive text summarization. The main reason is that the long text to be summarized usually consists of multiple sentences and contains much redundant information. To tackle this problem, we propose a selective and coverage multi-head attention framework based on the original Transformer. It contains a Convolutional Neural Network (CNN) selective gate, which combines n-gram features with the whole semantic representation to extract core information from the long input sentence. In addition, we use a coverage mechanism in the multi-head attention to keep track of the words that have already been summarized. Evaluations on Chinese and English text summarization datasets both demonstrate that the proposed selective and coverage multi-head attention model outperforms the baseline models by 4.6 and 0.3 ROUGE-2 points, respectively. The analysis shows that the proposed model generates summaries with higher quality and less repetition.
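The two components named in the abstract can be illustrated with a minimal sketch. The NumPy code below is an assumption-laden illustration, not the authors' implementation: the function names (selective_gate, coverage_attention), the additive gating form, and the single-head dot-product scoring with a scalar coverage weight are all assumptions made for clarity. It shows the general idea of a gate that filters encoder states using a global sentence vector, and of attention scores that incorporate a coverage vector accumulating past attention weights.

```python
# Minimal sketch of a selective gate and coverage-aware attention.
# Shapes, parameter names, and scoring form are illustrative assumptions.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def selective_gate(H, s, W_h, W_s, b):
    """H: (n, d) encoder states (e.g. CNN n-gram features),
    s: (d,) whole-sentence representation.
    Returns gated states H * sigmoid(H W_h + s W_s + b)."""
    gate = sigmoid(H @ W_h + s @ W_s + b)   # (n, d) element-wise gate
    return H * gate

def coverage_attention(q, H, coverage, w_cov):
    """Single-head attention whose scores include a coverage term.
    q: (d,) query, H: (n, d) gated encoder states,
    coverage: (n,) accumulated attention from previous decoding steps."""
    scores = H @ q + w_cov * coverage       # coverage informs the scores
    attn = softmax(scores)                  # (n,) attention weights
    new_coverage = coverage + attn          # track words already attended
    context = attn @ H                      # (d,) context vector
    return context, attn, new_coverage
```

In the paper's multi-head setting the coverage vector would be maintained per attention head across decoding steps; the sketch shows a single head and a single step only to make the mechanism explicit.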