Abstract

Keyword extraction is the process of identifying the most relevant terms and expressions in a given text in a timely manner. In the era of information explosion, keyword extraction has attracted increasing attention. Its importance in text summarization, text comparison, and document categorization has led to an emphasis on graph-based keyword extraction techniques, which can capture more structural information than other classic text analysis methods. In this paper, we propose a simple unsupervised text mining approach that extracts a set of keywords from a given text and analyzes its topic diversity using graph analysis tools. First, the text is represented as a directed graph using synonym relationships. Then, community detection and other measures are used to identify keywords in the text. The set of extracted keywords is used to assess topic diversity within the text and to analyze its sentiment. Because the proposed approach relies on grouping semantically similar candidate words, it ensures that the set of extracted keywords is comprehensive. Unlike other graph-based keyword extraction approaches, the proposed method does not require user parameters during graph construction and word scoring. The proposed approach achieved significant results compared to other keyword extraction techniques.
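
The pipeline outlined in the abstract (synonym graph, grouping of related words, one keyword per group) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy lexicon SYNONYMS, the function names, the use of weakly connected components as a stand-in for community detection, and the rule of picking the most frequent word in each group are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy synonym lexicon (an assumption); a real system might
# query a thesaurus instead of a hard-coded dictionary.
SYNONYMS = {
    "car": {"automobile", "vehicle"},
    "automobile": {"car"},
    "vehicle": {"car"},
    "happy": {"glad", "joyful"},
    "glad": {"happy"},
    "joyful": {"happy"},
}


def build_synonym_graph(words):
    """Directed graph: edge u -> v when v is a listed synonym of u
    and both words occur in the text."""
    vocab = set(words)
    graph = {w: set() for w in vocab}
    for u in vocab:
        for v in SYNONYMS.get(u, ()):
            if v in vocab:
                graph[u].add(v)
    return graph


def weak_components(graph):
    """Stand-in for community detection (an assumption): group nodes
    into weakly connected components, ignoring edge direction."""
    undirected = defaultdict(set)
    for u, neighbours in graph.items():
        undirected[u]  # register isolated nodes as well
        for v in neighbours:
            undirected[u].add(v)
            undirected[v].add(u)
    seen, components = set(), []
    for start in sorted(undirected):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(undirected[node] - seen)
        components.append(component)
    return components


def extract_keywords(words):
    """Pick one representative per synonym group: the most frequent
    word, with ties broken alphabetically (an illustrative rule)."""
    counts = Counter(words)
    keywords = []
    for component in weak_components(build_synonym_graph(words)):
        if len(component) < 2:
            continue  # assumption: skip words with no synonym in the text
        keywords.append(min(component, key=lambda w: (-counts[w], w)))
    return sorted(keywords)


sample = "the car is a fast automobile and the happy driver feels glad in the car".split()
print(extract_keywords(sample))  # prints ['car', 'glad']
```

In this toy setting, the number of synonym groups found could serve as a crude indication of how many distinct topics the text touches, although the measure actually used in the paper is not specified in the abstract.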

Highlights

  • Social media outlets produce extremely large amounts of data

  • Keyword extraction approaches can be categorized as statistical [5, 6, 7], machine learning [8, 9], linguistic [10], and graph-based approaches [3, 11, 12, 13, 14, 15, 16, 17, 18]

  • YAKE computes a score for each term based on five features: case, position, frequency, relatedness to context, and how often a candidate word appears in different sentences
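
For reference, the way YAKE combines the five features listed above into a single term score can be written as follows. This is a sketch based on the published description of YAKE; the symbol names are assumptions chosen here for readability, not notation from this paper. A lower score indicates a more relevant term.

```latex
% Assumed symbols: T_case (casing), T_pos (position), T_freq (frequency),
% T_rel (relatedness to context), T_sent (spread across different sentences).
S(t) = \frac{T_{\mathrm{rel}} \cdot T_{\mathrm{pos}}}
            {T_{\mathrm{case}} + \dfrac{T_{\mathrm{freq}}}{T_{\mathrm{rel}}} + \dfrac{T_{\mathrm{sent}}}{T_{\mathrm{rel}}}}
```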


Summary

Introduction

Social media outlets produce extremely large amounts of data. Text analysis provides an effective way to process and utilize the most relevant data. Such analysis supports various applications in different domains, such as marketing, content filtering, and search. Manual processing of the huge number of documents available online is tedious, time-consuming, and error-prone. Text mining refers to the automatic extraction of information and the identification of valuable and previously unknown hidden patterns from unstructured textual data [1]. Text mining algorithms make it possible to process huge amounts of unstructured textual data efficiently and effectively.
