Abstract

The automatic summarization of technical articles is a field that has garnered considerable interest and accounts for a significant share of NLP research. Automatic summarization can be split into two broad categories: extractive and abstractive. Extractive summarization selects important and relevant sentences from the article verbatim and inserts them into the summary. Abstractive summarization, on the other hand, requires contextual understanding of the document, rearranging and shortening its sentences while maintaining the core essence of the article. Multiple algorithms have been proposed for both classes of automatic summarization. In the recent past, the emergence of pre-trained language models for NLP tasks has been heralded by the creation of attention mechanisms and Transformers. These models implement encoder-decoder structures and have far wider applications than previously utilised algorithms. Four such pre-trained models are BERT, BART, XLNet and GPT-2. In this project, we use transfer learning to fit these models to our corpus of Medium articles, fine-tuning them for the task of summarization. Further, we obtain the summaries available online for these articles and use ROUGE metrics to evaluate the accuracy of the model-generated summaries against them. We also apply the popular Word2Vec model to the same data and compare its results with those obtained from the attention-based models, once again using ROUGE metrics. In addition, we explore the effectiveness of a hybrid approach to the summarization task, applying different combinations of the models to the same article and comparing the resulting summaries.
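To make the evaluation pipeline described above concrete, the following is a minimal sketch (not the authors' code) of generating an abstractive summary with a pre-trained BART checkpoint from the Hugging Face transformers library and scoring it against a reference summary with ROUGE. The checkpoint name, placeholder texts and metric settings are illustrative assumptions; in the project itself the models are further fine-tuned on the Medium corpus before evaluation.

    # Minimal sketch, assuming an off-the-shelf BART checkpoint rather than the
    # project's fine-tuned weights; article and reference texts are placeholders.
    from transformers import BartTokenizer, BartForConditionalGeneration
    from rouge_score import rouge_scorer

    MODEL_NAME = "facebook/bart-large-cnn"  # assumed checkpoint, not from the paper
    tokenizer = BartTokenizer.from_pretrained(MODEL_NAME)
    model = BartForConditionalGeneration.from_pretrained(MODEL_NAME)

    article = "..."            # full text of a Medium article (placeholder)
    reference_summary = "..."  # the summary available online, used as reference (placeholder)

    # Tokenize the article and generate an abstractive summary with beam search.
    inputs = tokenizer(article, max_length=1024, truncation=True, return_tensors="pt")
    summary_ids = model.generate(
        inputs["input_ids"],
        num_beams=4,
        max_length=142,
        early_stopping=True,
    )
    generated_summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

    # Score the generated summary against the online reference with ROUGE-1/2/L.
    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
    scores = scorer.score(reference_summary, generated_summary)
    for name, result in scores.items():
        print(f"{name}: P={result.precision:.3f} R={result.recall:.3f} F1={result.fmeasure:.3f}")

The same generate-and-score loop can be repeated with the other pre-trained models (or with an extractive Word2Vec baseline) to produce the ROUGE comparisons described in the abstract.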
