Abstract

Text summarization is a natural language processing (NLP) application that is being researched extensively and applied to reduce the time spent on long, text-heavy tasks. However, because NLP is still a young field, most work has been limited to English, leaving regional languages largely untouched despite their large numbers of speakers. Hindi is one such language. In this paper, we propose an effective method for summarizing news articles in Hindi. As in the English variant of this application, we aim to emphasize the important sections of a Hindi news report and summarize it in 60 to 80 words. The summarization technique tries to identify the theme of the news, named entities and numbers, title terms, etc., to construct a keyword table. This table is then compared against a knowledge base of weighted keywords to rank the important sentences in order of relevance and finally pick out the sentences most needed for the summary. Our motivation for summarizing Hindi news articles specifically stems from a dilemma: although these articles are a rich source of opinionated information on various topics, readers often ignore them because their long-winded nature buries the useful information in a sea of words, decorated with lengthy introductions and linguistic ornaments such as idioms. Hence, this system should provide an effective means of summarization that surfaces useful information while pruning such irrelevant details.
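The keyword-table idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the bonus values for title terms and numbers, and the sample weights are all hypothetical, and whitespace tokenization stands in for a proper Hindi tokenizer and named-entity step.

```python
import re

PUNCT = "।,.?!;:\"'()"  # characters stripped from token edges (assumed set)

def split_sentences(text):
    # Hindi sentences usually end with the danda "।"; "?" and "!" also end one.
    parts = re.findall(r"[^।?!]+[।?!]?", text)
    return [p.strip() for p in parts if p.strip()]

def tokenize(sentence):
    # Whitespace split avoids \w, which would break words at Devanagari matras.
    tokens = (tok.strip(PUNCT) for tok in sentence.split())
    return [tok for tok in tokens if tok]

def score_sentence(sentence, keyword_weights, title_terms):
    # Hypothetical scoring: knowledge-base weight, plus small bonuses
    # for title terms (1.0) and for tokens containing a digit (0.5).
    score = 0.0
    for tok in tokenize(sentence):
        score += keyword_weights.get(tok, 0.0)
        if tok in title_terms:
            score += 1.0
        if any(ch.isdigit() for ch in tok):
            score += 0.5
    return score

def summarize(text, title, keyword_weights, max_words=80):
    sentences = split_sentences(text)
    title_terms = set(tokenize(title))
    # Rank sentence indices by score, highest first (ties keep document order).
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score_sentence(sentences[i], keyword_weights, title_terms),
        reverse=True,
    )
    chosen, used = [], 0
    for i in ranked:
        n = len(tokenize(sentences[i]))
        if used + n > max_words:
            continue  # skip sentences that would exceed the word budget
        chosen.append(i)
        used += n
    # Emit the selected sentences in their original order.
    return " ".join(sentences[i] for i in sorted(chosen))
```

Calling `summarize(article_text, headline, weights, max_words=80)` with a dictionary of weighted keywords returns the highest-scoring sentences that fit the 80-word upper limit. The greedy budget fill is one simple design choice among several; it does not enforce the 60-word lower bound mentioned in the abstract.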

Highlights

  • Text summarization and sentiment analysis are two applications of Natural Language Processing that are gradually becoming part of daily life by reducing the time taken by many processes that otherwise required extensive human input and accumulated knowledge

  • Although summarization is a much-studied problem, idiom elimination was one of the major hurdles we faced, because idioms often contain terms that score positively on the features used

  • The Restricted Boltzmann Machine (RBM) proved to be an effective way of dealing with the problem, as it neatly incorporated all the features designed to address it


Summary

Introduction

Text summarization and sentiment analysis are two applications of Natural Language Processing that are gradually becoming part of daily life by reducing the time taken by many processes that otherwise required extensive human input and accumulated knowledge. A broader application category of NLP allows for background uses such as monitoring media content all over the world, structuring petabytes of unstructured data in the data warehouses of large companies, along with directly.

Revised Manuscript Received on April 02, 2020. Pandian*, Associate Professor, Computer Science Department, SRM Institute of Science and Technology, Chennai, India. Rajeshwari, Student, Computer Science Department, SRM Institute of Science and. Abhishek Saxena, Student, Computer Science Department, SRM
