Automatic Keywords Extraction Based on Co-Occurrence and Semantic Relationships Between Words

Xiangke Mao, Shaobin Huang, Rongsheng Li, Linshan Shen

doi:10.1109/access.2020.3004628

Abstract

Automatic keywords extraction is a method that extracts from a document the words or phrases that express its main idea. In this paper, we propose an unsupervised keywords extraction framework for individual documents, which improves keywords extraction in two respects. In the candidate keywords selection step, we use stopword removal, regular-expression matching, and length filtering to reduce the number of candidate keywords while improving their quality. In the word scoring step, we use word co-occurrence, semantic relationships (WordNet, Word Embedding, Normalized Google Distance), and three ways of combining word co-occurrence with semantic relationships to measure the weight of edges in the graph model. In experiments, we use Precision, Recall, and F1-measure as evaluation criteria to compare all of the keywords extraction methods we propose with other strong baseline methods on two datasets. According to the experimental results, the methods under our proposed framework achieve good results. We verify that methods using both word co-occurrence and semantic relationships extract keywords more effectively than methods using co-occurrence or semantic relationships alone. We also find that, for keywords extraction from individual documents, co-occurrence between words is more effective than semantic relationships.
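The two steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the stopword list, the window size, the linear mixing weight `alpha`, the pluggable `similarity` function, and the use of a weighted PageRank-style ranking are all hypothetical choices made here, since the abstract does not give the exact formulas. The last helper shows how Precision, Recall, and F1 are usually computed against a gold keyword set.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (assumption; not the list used in the paper).
STOPWORDS = {"a", "an", "the", "of", "in", "is", "and", "to", "for", "that",
             "from", "which", "can", "or", "on", "with"}

def candidate_words(text, min_len=3):
    """Candidate selection: regex matching, stopword removal, length filtering."""
    tokens = re.findall(r"[a-z]+", text.lower())  # keep alphabetic tokens only
    return [t for t in tokens if t not in STOPWORDS and len(t) >= min_len]

def cooccurrence_graph(words, window=3):
    """Undirected graph whose edge weight counts co-occurrences in a sliding window."""
    graph = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):
        for v in words[i + 1 : i + window]:
            if v != w:
                graph[w][v] += 1.0
                graph[v][w] += 1.0
    return graph

def combine(graph, similarity, alpha=0.5):
    """Hypothetical linear mix of co-occurrence count and a semantic similarity score."""
    return {
        w: {v: alpha * c + (1 - alpha) * similarity(w, v) for v, c in nbrs.items()}
        for w, nbrs in graph.items()
    }

def rank(graph, damping=0.85, iterations=50):
    """Weighted PageRank-style scoring (assumed ranking method); best word first."""
    scores = {w: 1.0 for w in graph}
    total = {w: sum(nbrs.values()) for w, nbrs in graph.items()}
    for _ in range(iterations):
        scores = {
            w: (1 - damping) + damping * sum(
                graph[v][w] / total[v] * scores[v]
                for v in graph[w] if total[v] > 0
            )
            for w in graph
        }
    return sorted(scores, key=scores.get, reverse=True)

def precision_recall_f1(predicted, gold):
    """Set-based Precision, Recall, and F1 for extracted versus gold keywords."""
    predicted, gold = set(predicted), set(gold)
    hits = len(predicted & gold)
    p = hits / len(predicted) if predicted else 0.0
    r = hits / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

if __name__ == "__main__":
    text = ("Keyword extraction finds the keywords of a document. "
            "Graph keyword extraction ranks keyword graph nodes.")
    words = candidate_words(text)
    # Placeholder similarity: a constant stands in for WordNet or embedding scores.
    weighted = combine(cooccurrence_graph(words), similarity=lambda a, b: 0.1)
    print(rank(weighted)[:3])
```

In practice the `similarity` argument would be replaced by a real measure such as a WordNet-based score or the cosine similarity of word embeddings, and `alpha` would be tuned on held-out documents.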

Highlights

Automatic keywords extraction (AKE) is a kind of method that automatically captures the theme of a document using a small set of words that occur in the document
AKE is widely used in many natural language processing (NLP) tasks, such as Text Classification (TC) [1], Document Summarization (DS) [2], [3], and Information Retrieval (IR) [4], [25]. For an IR system, keywords can be used to index documents and improve the accuracy of retrieval results
Our methods verify that combining co-occurrence and semantic relationships between words can improve the effectiveness of keywords extraction

Summary

Introduction

Automatic keywords extraction (AKE) is a kind of method that automatically captures the theme of a document using a small set of words that occur in the document. In the age of the "information explosion", AKE is one way for people to learn information quickly from the ocean of documents. AKE is widely used in many natural language processing (NLP) tasks, such as Text Classification (TC) [1], Document Summarization (DS) [2], [3], and Information Retrieval (IR) [4], [25]. For an IR system, keywords can be used to index documents and improve the accuracy of retrieval results. Keywords can be seen as a condensed summary of a document.

Journal: IEEE Access
Publication Date: Jan 1, 2020
Citations: 12
License type: CC BY 4.0
