Abstract
Document summarization is a natural language processing (NLP) task that condenses long text into a concise, fluent summary that retains the document's relevant information. The branch of NLP that addresses it is automatic text summarization. There are generally two approaches to automatic text summarization: extractive summarization and abstractive summarization. This paper presents an experiment in the context of extractive text summarization. Topic modelling, on the other hand, is an NLP task that extracts the relevant topics from a text document. One such method is Latent Semantic Analysis (LSA) using truncated SVD, which extracts the relevant topics from the text. In the experiment, the proposed method summarizes a long text document using LSA topic modelling together with a TF-IDF keyword extractor applied to each sentence, and uses a BERT encoder model to encode the sentences of the document in order to retrieve the positional embeddings of the topic word vectors. The proposed algorithm achieves a higher score than text summarization using Latent Dirichlet Allocation (LDA) topic modelling.
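To make the LSA and TF-IDF part of the pipeline concrete, the following is a minimal sketch and not the paper's implementation. It assumes a simplified setting: each sentence becomes a TF-IDF row, a single latent topic is approximated as the leading right singular vector (a rank-1 truncated SVD computed by power iteration), and sentences are ranked by their projection onto that topic. The function names `tfidf_matrix`, `top_topic`, and `summarize`, the smoothed IDF formula, and the scoring rule are all illustrative assumptions; the BERT encoding step is omitted.

```python
import math
import re
from collections import Counter


def tfidf_matrix(sentences):
    """Build one TF-IDF row per sentence (hypothetical helper)."""
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    vocab = sorted({w for toks in tokenized for w in toks})
    n = len(sentences)
    # Document frequency: number of sentences containing each word.
    df = Counter(w for toks in tokenized for w in set(toks))
    # Smoothed IDF; an illustrative choice, not taken from the paper.
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1.0 for w in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[w] * idf[w] for w in vocab])
    return rows, vocab


def top_topic(matrix, iterations=100):
    """Approximate the leading right singular vector of the matrix
    (a rank-1 truncated SVD) by power iteration on A^T A."""
    n_terms = len(matrix[0]) if matrix else 0
    if n_terms == 0:
        return []
    v = [1.0 / math.sqrt(n_terms)] * n_terms
    for _ in range(iterations):
        # u = A v lives in sentence space; w = A^T u lives in term space.
        u = [sum(a * b for a, b in zip(row, v)) for row in matrix]
        w = [sum(matrix[i][j] * u[i] for i in range(len(matrix)))
             for j in range(n_terms)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def summarize(sentences, n_sentences=2):
    """Return the sentences that project most strongly onto the topic,
    kept in their original document order."""
    if not sentences:
        return []
    matrix, _ = tfidf_matrix(sentences)
    topic = top_topic(matrix)
    scores = [abs(sum(a * b for a, b in zip(row, topic))) for row in matrix]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:n_sentences])
    return [sentences[i] for i in keep]


if __name__ == "__main__":
    doc = [
        "Topic models extract topics from text.",
        "Cats sleep.",
        "Latent semantic analysis extracts topics from text using truncated SVD.",
        "Text summarization selects sentences covering the main topics of the text.",
    ]
    for sentence in summarize(doc, n_sentences=2):
        print(sentence)
```

Because the rows are not length-normalized, this scoring tends to favour longer sentences. A fuller version would likely keep several singular vectors rather than one and normalize the rows; in practice a library SVD routine would replace the hand-written power iteration.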