Abstract

Automatic key concept extraction from text is a central and challenging task in information extraction, information retrieval, digital libraries, ontology learning, and text analysis. Statistical frequency-based and topical graph-based ranking are two leading families of unsupervised approaches devised to address this problem. To exploit the potential of these approaches and improve key concept identification, a comprehensive performance analysis on datasets from different domains is needed. The objective of the study presented in this paper is to perform a comprehensive empirical analysis of selected frequency-based and topical graph-based algorithms for key concept extraction on three different datasets, and to identify the major sources of error in these approaches. For the experimental analysis, we selected TF-IDF, KP-Miner and TopicRank. Three major sources of error, namely frequency errors, syntactic errors and semantic errors, are identified, together with the factors that contribute to them. Analysis of the results reveals that the performance of the selected approaches is significantly degraded by these errors. These findings can inform the development of a more intelligent solution for key concept extraction in the future.
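As a minimal illustration of the frequency-based family named above, the sketch below scores candidate terms with a plain TF-IDF weighting. The whitespace tokenization, the smoothed IDF variant, and the toy documents are simplifying assumptions made for illustration only, not the exact pipeline evaluated in the paper.

    import math
    from collections import Counter

    def tfidf_scores(documents):
        """Return one {term: score} dict per document (raw TF x smoothed IDF)."""
        tokenized = [doc.lower().split() for doc in documents]
        n_docs = len(tokenized)
        df = Counter()                      # document frequency of each term
        for tokens in tokenized:
            df.update(set(tokens))
        scores = []
        for tokens in tokenized:
            tf = Counter(tokens)
            scores.append({term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
                           for term, count in tf.items()})
        return scores

    # Toy usage: print the three highest-scoring terms per document
    docs = ["ontology learning extracts key concepts from text",
            "key concept extraction supports information retrieval"]
    for per_doc in tfidf_scores(docs):
        print(sorted(per_doc.items(), key=lambda kv: kv[1], reverse=True)[:3])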

Highlights

  • The key concepts in an ontology of a specific domain represent a set of important entity classes or objects [1,2]

  • Digging deeper into why TopicRank performs poorly and behaves inconsistently on the SemEval-2010 and Quranic datasets, we found that the main cause lies in the way topics are generated and weighted

  • We present the overall performance of the above methods in terms of Average Precision (AP), which measures how early in the ranked list an algorithm places the correct key concepts (a minimal computation sketch follows this list)
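As a hedged sketch of how AP can be computed for a single document, the snippet below averages precision at every rank where a gold-standard key concept is hit. The ranked list and gold set are illustrative placeholders, not data from the paper's experiments.

    def average_precision(ranked, gold):
        """Mean of precision@k over the ranks k at which a gold concept appears."""
        hits, precisions = 0, []
        for k, concept in enumerate(ranked, start=1):
            if concept in gold:
                hits += 1
                precisions.append(hits / k)
        return sum(precisions) / len(gold) if gold else 0.0

    ranked = ["ontology", "text analysis", "key concept", "graph ranking"]
    gold = {"ontology", "key concept"}
    print(average_precision(ranked, gold))   # hits at ranks 1 and 3 -> (1.0 + 2/3) / 2 ≈ 0.83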

Introduction

The key concepts in an ontology of a specific domain represent a set of important entity classes or objects [1,2]. Extracting these key concepts automatically is a fundamental and challenging step in ontology learning. Statistical frequency-based and topical graph-based ranking methods are among the leading unsupervised approaches to this task. To utilize their potential for improving key concept identification, we need to thoroughly analyze the performance of methods based on these approaches on datasets from different domains, and to investigate the underlying reasons and sources of error where results are poor. To gain a better understanding of these approaches by identifying their shortcomings, and to provide future research directions, we examine three state-of-the-art methods and evaluate their performance on three different datasets. We describe these datasets later in the analysis section.

Related Work
Common Extraction Steps
TF-IDF
KP-Miner
TopicRank
Comparative Analysis
Experimental Setup
Performance Measures
Individual Performance
Overall Performance
Method
Error Source Analysis
Conclusions
Findings
Limitations