Keyword extraction using supervised cumulative TextRank

Monali Bordoloi, Biswajit Purkayastha, Saroj Kumar Biswas, Preetam Chayan Chatterjee

doi:10.1007/s11042-020-09335-1

https://doi.org/10.1007/s11042-020-09335-1

Abstract

Keyword extraction is a major step in extracting valuable and meaningful information from the rich source that is the World Wide Web (WWW). Various keyword extraction algorithms have been proposed, each with its own advantages and disadvantages. Vector Space Model (VSM) algorithms are quite effective for keyword extraction but do not make use of the class label information of classified data. Supervised Term Weighting (STW) algorithms address this problem but suffer from high dimensionality and do not incorporate semantic relationships between terms. Graph Based Models (GBM) were introduced to address these problems; however, they too rely on unsupervised learning. Hence, this paper proposes a Keyword Extraction using Supervised Cumulative TextRank (KESCT) technique that exploits the benefits of both VSM and GBM techniques. The proposed algorithm modifies TextRank by incorporating a novel Unique Statistical Supervised Weight (USSW) that captures the class label information of classified data. To account for the relatedness between terms, the mutual information between terms is also included. The proposed algorithm is validated on four review datasets, and its results are compared with those of traditional TextRank and its variants using a Support Vector Machine (SVM) classifier, a Naive Bayes (NB) classifier, and an ensemble classifier. Experimental results indicate that the proposed algorithm is more effective than the existing algorithms.
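
The general idea of a class-aware TextRank can be illustrated with a small sketch. It is not the KESCT algorithm itself: the abstract does not give the USSW formula, so `class_weight` below is a hypothetical stand-in (the largest share of a term's documents that fall in a single class), and the term-relatedness measure is assumed to be positive pointwise mutual information over document-level co-occurrence. The function names, the damping factor, and the iteration count are likewise illustrative assumptions.

```python
import math
from collections import defaultdict
from itertools import combinations


def class_weight(docs):
    """Hypothetical stand-in for a supervised term weight (NOT the paper's USSW):
    the largest fraction of a term's documents that belong to one class."""
    term_class = defaultdict(lambda: defaultdict(int))
    for tokens, label in docs:
        for term in set(tokens):
            term_class[term][label] += 1
    return {t: max(c.values()) / sum(c.values()) for t, c in term_class.items()}


def ppmi_edges(docs):
    """Assumed relatedness measure: positive PMI between terms sharing a document."""
    n = len(docs)
    df = defaultdict(int)        # document frequency of each term
    pair_df = defaultdict(int)   # document frequency of each term pair
    for tokens, _ in docs:
        terms = sorted(set(tokens))
        for t in terms:
            df[t] += 1
        for a, b in combinations(terms, 2):
            pair_df[(a, b)] += 1
    edges = defaultdict(dict)
    for (a, b), c in pair_df.items():
        pmi = math.log(n * c / (df[a] * df[b]))
        if pmi > 0:              # keep only positively associated pairs
            edges[a][b] = pmi
            edges[b][a] = pmi
    return edges


def supervised_textrank(docs, damping=0.85, iters=50):
    """Weighted TextRank whose teleport term is biased by the class weight.
    Scores are not renormalised, so isolated terms keep only their bias share."""
    weights = class_weight(docs)
    edges = ppmi_edges(docs)
    total = sum(weights.values())
    bias = {t: w / total for t, w in weights.items()}
    out_sum = {u: sum(nbrs.values()) for u, nbrs in edges.items()}
    scores = dict(bias)
    for _ in range(iters):
        new_scores = {}
        for v in scores:
            incoming = sum(
                scores[u] * w / out_sum[u] for u, w in edges.get(v, {}).items()
            )
            new_scores[v] = (1 - damping) * bias[v] + damping * incoming
        scores = new_scores
    return sorted(scores, key=scores.get, reverse=True)


# Toy labelled reviews (invented for illustration only).
docs = [
    (["great", "battery", "love"], "pos"),
    (["great", "screen", "love"], "pos"),
    (["poor", "battery", "broken"], "neg"),
    (["poor", "screen", "broken"], "neg"),
]
ranking = supervised_textrank(docs)
```

On this toy input, terms that occur in only one class ("great", "love", "poor", "broken") rank above terms that are spread evenly across both classes ("battery", "screen"), which is the qualitative effect that adding class label information to a graph-based ranker is meant to produce.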

Journal: Multimedia Tools and Applications
Publication Date: Aug 21, 2020
Citations: 11

Similar Papers

A support vector machine classifier reduces interscanner variation in the HRCT classification of regional disease pattern in diffuse lung disease: Comparison to a Bayesian classifier
Yongjun Chang ... Jonghyuck Lim
Medical Physics | VOL. 40
24 Apr 2013

Patent Keyword Extraction Algorithm Based on Distributed Representation for Patent Classification
Jie Hu ... Shaobo Li
Entropy | VOL. 20
02 Feb 2018

A boosted SVM based ensemble classifier for sentiment analysis of online reviews
Anuj Sharma ... Shubhamoy Dey
ACM SIGAPP Applied Computing Review | VOL. 13
01 Dec 2013

Text Keyword Extraction Based on Multi-dimensional Features
Yu Jin ... Lizhen Xu
01 Jan 2020
