Abstract
To address the difficulties of document-level summarization, this paper presents a two-stage summarization model that is first extractive and then abstractive. In the first stage, we extract the important sentences using either a sentence similarity matrix (used only in the first round) or a pseudo-title, together with features such as sentence position and paragraph position. This stage extracts coarse-grained sentences from the document while taking into account how the most important sentences in the document differ from one another. The second stage is abstractive: we use a beam search algorithm to restructure and rewrite the syntactic blocks of the extracted sentences. The newly generated summary sentence serves as the pseudo-summary for the next round, and the globally optimal pseudo-title serves as the final summary. Extensive experiments on the corresponding data set show that our model obtains better results.
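The extractive stage described above can be illustrated with a minimal sketch. It assumes a bag-of-words cosine similarity, a single linear position feature, and a hypothetical mixing weight `position_weight`; the abstract does not specify the similarity measure, the full feature set, or how the features are combined, so every function name and parameter here is illustrative only.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(sentences):
    """Pairwise sentence similarity matrix (assumed: bag-of-words cosine)."""
    bags = [Counter(tokenize(s)) for s in sentences]
    n = len(bags)
    return [[cosine(bags[i], bags[j]) for j in range(n)] for i in range(n)]


def extract_sentences(sentences, k=2, position_weight=0.3, pseudo_title=None):
    """Pick k sentences, returned in document order.

    First round (pseudo_title is None): relevance is the mean similarity
    of a sentence to all other sentences, taken from the similarity matrix.
    Later rounds: relevance is the similarity to the current pseudo-title.
    position_weight is a hypothetical weight for the position feature.
    """
    n = len(sentences)
    if n == 0:
        return []
    if pseudo_title is None:
        matrix = similarity_matrix(sentences)
        relevance = [
            (sum(row) - row[i]) / max(n - 1, 1) for i, row in enumerate(matrix)
        ]
    else:
        title_bag = Counter(tokenize(pseudo_title))
        relevance = [cosine(Counter(tokenize(s)), title_bag) for s in sentences]

    scores = []
    for i in range(n):
        position = 1.0 - i / n  # earlier sentences get a higher position score
        scores.append(
            (1 - position_weight) * relevance[i] + position_weight * position
        )

    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `extract_sentences(sentences)` corresponds to the first round, and passing `pseudo_title=` corresponds to later rounds in which the previously generated summary sentence guides the extraction.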
Highlights
With the explosive growth of text data on the web, how to quickly obtain the nut graph or thematic meaning of a long text is a vitally important research problem in natural language processing.
The main reason is that the proposed model combines a variety of features, including deep learning features and text display features, which gives it stronger semantic representation capability.
2: According to the dependency relationships among the words in each sentence, divide the sentence into dependency syntactic blocks and record the dependency relationships of the blocks: si_block = {block1, block2, block3, ..., blockn};
3: Normalize the syntactic blocks, and fill in the missing syntactic blocks according to the context;
4: Apply the beam search algorithm to the generated syntactic blocks;
5: Score the generated summary sentences; the sentence with the highest S_summary serves as the summary.
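Steps 4 and 5 above can be sketched as a generic beam search over candidate syntactic blocks. This is only an illustration under stated assumptions: the slot structure (one list of candidate blocks per position), the scoring function `toy_score`, and the `KEYWORDS` set are hypothetical stand-ins, since the source does not describe how blocks are scored.

```python
def beam_search(slots, score_step, beam_width=3):
    """Assemble a sentence from one block per slot, keeping the best partial results.

    slots:      list of lists; each inner list holds candidate blocks for one slot.
    score_step: function (previous_blocks, block) -> float, higher is better.
    Returns a list of (blocks, score) tuples sorted from best to worst.
    """
    beams = [((), 0.0)]
    for candidates in slots:
        expanded = []
        for blocks, score in beams:
            for block in candidates:
                new_score = score + score_step(blocks, block)
                expanded.append((blocks + (block,), new_score))
        # Keep only the beam_width highest-scoring partial sentences.
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = expanded[:beam_width]
    return beams


# Hypothetical scorer: reward keywords, penalize length and repeated words.
KEYWORDS = {"model", "summary", "extracts", "rewrites"}


def toy_score(previous_blocks, block):
    words = block.lower().split()
    used = {w for b in previous_blocks for w in b.lower().split()}
    keyword_hits = sum(1 for w in words if w in KEYWORDS)
    repeats = sum(1 for w in words if w in used)
    return keyword_hits - 0.1 * len(words) - 0.5 * repeats


example_slots = [
    ["the model", "our proposed two-stage model"],
    ["extracts key sentences", "picks things"],
    ["and rewrites them", "and then it rewrites the model output again"],
]

best_blocks, best_score = beam_search(example_slots, toy_score)[0]
print(" ".join(best_blocks), round(best_score, 2))
```

The top-ranked beam plays the role of the highest-scoring summary sentence in step 5. A wider beam explores more block combinations at higher cost, and `beam_width=1` reduces the search to greedy selection.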
Summary
This work was supported in part by the Shandong Province Social Science Popularization and Application Research Project under Grant 2020-SKZZ-51, in part by the Social Science Planning Office of Heze City under Grant 2020_zz_55, in part by the Heze University Doctoral Research and Development Fund under Grant XY20BS19, and in part by Shandong Province Educational Science Planning under Grant BYZN201910.