Abstract
In this era where a large amount of information has flooded the Internet, manual extraction and consumption of relevant information is very difficult and time-consuming. Therefore, an automated document summarization tool is necessary to excerpt important information from a set of documents that have similar or related subjects. Multi-document summarization allows retrieval of important and relevant content from multiple documents while minimizing redundancy. A multi-document text summarization system is developed in this study using an unsupervised extractive-based approach. The proposed model is a fusion of two learning paradigms: the T5 pre-trained transformer model and the K-Means clustering algorithm. We perform the experiments on the benchmark news article corpus Document Understanding Conference (DUC2004). The ROUGE evaluation metrics were used to estimate the performance of the proposed approach on the DUC2004. Outcomes validate that our proposed model shows greatly enhanced performance as compared to the existent unsupervised state-of-the-art approaches.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have