Abstract

Text summarization is the process of shortening a text document while preserving its overall context and content. A good summary should convey the key concepts of the document. Text summarization is a key area of Natural Language Processing (NLP) and uses various NLP tools to extract meaningful information from a given text. Abstractive Text Summarization (ATS) and Extractive Text Summarization (ETS) are the two main techniques. In this paper, a Punjabi extractive text summarizer is developed using an unsupervised machine learning approach. The proposed methodology consists of several modules: tokenization of the Punjabi text, removal of stop words, generation of a similarity matrix, ranking based on the similarity matrix, and summary generation. The proposed system has been evaluated using different ROUGE scores (ROUGE-N, ROUGE-L, and ROUGE-S). The three highest scores obtained are ROUGE-1 at 0.71, ROUGE-L at 0.56, and ROUGE-S at 0.56.
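
The modules listed in the abstract (tokenization, stop-word removal, similarity matrix, ranking, summary generation) can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the ranking algorithm, or the stop-word list, so everything below is an assumption: cosine similarity over bag-of-words counts, a TextRank-style power iteration for ranking, and a tiny illustrative stop-word set. The function names and parameters (`summarize`, `damping`, `k`) are hypothetical and not taken from the paper.

```python
import math
import re
from collections import Counter

# Illustrative stop words only (assumption); the paper's actual list is not given.
STOP_WORDS = {"ਦਾ", "ਦੇ", "ਦੀ", "ਹੈ", "ਹਨ", "ਅਤੇ", "ਨੂੰ", "ਵਿੱਚ", "ਤੇ", "ਇੱਕ"}


def split_sentences(text):
    """Split on the danda (।), '?', '!' and newlines; drop empty pieces."""
    parts = re.split(r"[।?!\n]+", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Whitespace tokenization with punctuation stripping and stop-word removal."""
    tokens = [t.strip(".,;:\"'()") for t in sentence.split()]
    return [t for t in tokens if t and t not in STOP_WORDS]


def cosine(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(sentences):
    """Pairwise sentence similarity; the diagonal is zeroed out."""
    bags = [Counter(tokenize(s)) for s in sentences]
    n = len(bags)
    return [[0.0 if i == j else cosine(bags[i], bags[j]) for j in range(n)]
            for i in range(n)]


def rank(matrix, damping=0.85, iterations=50):
    """TextRank-style scores via power iteration (assumed ranking method)."""
    n = len(matrix)
    if n == 0:
        return []
    row_sums = [sum(row) for row in matrix]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            incoming = sum(scores[j] * matrix[j][i] / row_sums[j]
                           for j in range(n) if row_sums[j] > 0)
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


def summarize(text, k=2):
    """Return the k highest-ranked sentences in their original order."""
    sentences = split_sentences(text)
    scores = rank(similarity_matrix(sentences))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(text, k=2)` on a short Punjabi passage returns the two sentences that share the most vocabulary with the rest of the text. Keeping the selected sentences in source order is a design choice that preserves readability in an extractive summary; a production system would likely also need a fuller stop-word list and Punjabi-specific normalization.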
