Abstract

Neural headline generation models have achieved strong results since neural network methods were first applied to text summarization. In this paper, we focus on news headline generation. We propose a news headline generation model based on a generative pre-training model, into which we introduce a rich-feature input module. Unlike other generation models, which use the encoder-decoder architecture, our model contains only a decoder, incorporating the pointer mechanism and n-gram language features. Experiments on news datasets show that our model achieves results comparable to existing methods for news headline generation.
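
A minimal sketch of the copy mechanism in such a decoder-only generator is given below. It is an illustrative example, not the paper's implementation: the PyTorch module names, tensor shapes, and the p_gen switch are all assumptions; it only shows how a vocabulary distribution and a copy distribution over source tokens can be mixed at each decoding step.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PointerHead(nn.Module):
    # Mixes a generation distribution over the vocabulary with a copy
    # distribution over source tokens (hypothetical names and shapes).
    def __init__(self, hidden_size, vocab_size):
        super().__init__()
        self.vocab_proj = nn.Linear(hidden_size, vocab_size)  # generation scores
        self.p_gen_proj = nn.Linear(hidden_size, 1)            # copy/generate switch

    def forward(self, hidden, src_ids, copy_attn):
        # hidden:    (batch, hidden_size) decoder state at one step
        # src_ids:   (batch, src_len)     source token ids
        # copy_attn: (batch, src_len)     attention over source tokens, rows sum to 1
        vocab_dist = F.softmax(self.vocab_proj(hidden), dim=-1)
        p_gen = torch.sigmoid(self.p_gen_proj(hidden))
        copy_dist = torch.zeros_like(vocab_dist)
        copy_dist.scatter_add_(1, src_ids, copy_attn)  # copy mass onto source token ids
        return p_gen * vocab_dist + (1.0 - p_gen) * copy_dist

# Toy usage with random tensors.
head = PointerHead(hidden_size=8, vocab_size=20)
final_dist = head(torch.randn(2, 8),
                  torch.randint(0, 20, (2, 5)),
                  F.softmax(torch.randn(2, 5), dim=-1))
print(final_dist.shape)  # (2, 20); each row sums to 1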

Highlights

  • The aim of text summarization is to condense a document while retaining the core meaning of the original document

  • We focus on the task of neural headline generation (NHG)

  • Owing to the part-of-speech features in the input data, the convolutional layer in our model, and the use of a generative pre-training architecture incorporating the pointer mechanism, our news headline generation model achieves results on LCSTS comparable to other methods (a sketch of the rich-feature input is shown below)
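
A minimal sketch of such a rich-feature input module follows. It is an assumption-laden illustration rather than the paper's code: the embedding sizes, tag-set size, and the single 1-D convolution are placeholders that only show how token and part-of-speech embeddings can be combined before decoding.

import torch
import torch.nn as nn

class RichFeatureInput(nn.Module):
    # Combines token and part-of-speech embeddings, then applies a 1-D
    # convolution over the sequence to mix local context (illustrative sizes).
    def __init__(self, vocab_size=20, pos_tag_count=10, d_model=16, kernel_size=3):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(pos_tag_count, d_model)
        self.conv = nn.Conv1d(2 * d_model, d_model, kernel_size, padding=kernel_size // 2)

    def forward(self, token_ids, pos_ids):
        # token_ids, pos_ids: (batch, seq_len)
        x = torch.cat([self.tok_emb(token_ids), self.pos_emb(pos_ids)], dim=-1)
        x = x.transpose(1, 2)                  # (batch, 2*d_model, seq_len)
        return self.conv(x).transpose(1, 2)    # (batch, seq_len, d_model)

# Toy usage with random ids.
module = RichFeatureInput()
out = module(torch.randint(0, 20, (2, 7)), torch.randint(0, 10, (2, 7)))
print(out.shape)  # (2, 7, 16)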



Introduction

The aim of text summarization is to condense a document while retaining its core meaning. Approaches using neural networks have shown promising results on the headline generation task, with end-to-end models that encode a source document and decode it into a news headline. The pioneering work on neural headline generation is [1], which uses the encoder-decoder framework to generate sentence-level summaries. With the development of the recurrent neural network (RNN) [2], [3] applied the attentive encoder-decoder model to sentence summarization. The transformer was employed for abstractive summarization [5], but the result did not improve over the attentive sequence-to-sequence model. Rothe et al. developed a transformer-based sequence-to-sequence model initialized with pre-trained BERT [7], GPT-2 and RoBERTa [8] checkpoints for sequence generation tasks [9]. The p is limited to 50,000 in this paper.

