Keyword extraction based on tf/idf for Chinese news document

Juanzi Li, Qi’Na Fan, Kuo Zhang

doi:10.1007/s11859-007-0038-4

Abstract

Keyword extraction is an important research topic in information retrieval. This paper specifies what counts as a keyword in Chinese news documents, based on an analysis of the linguistic characteristics of news documents, and then proposes a new keyword extraction method based on tf/idf with multiple strategies. The approach selects uni-, bi-, and tri-grams as candidate keywords and then defines features according to their morphological characteristics and context information. Moreover, the paper proposes several strategies to repair incomplete words produced by word segmentation and to find unknown potential keywords in news documents. Experimental results show that the proposed method can significantly outperform the baseline method. The method was also applied to retrospective event detection, where experimental results show that the accuracy and efficiency of retrospective event detection in news can be significantly improved.
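
As a rough illustration of the candidate-selection and scoring idea described in the abstract, the following minimal sketch ranks uni-, bi-, and tri-gram candidates by a plain tf/idf score. It is not the authors' implementation: it assumes documents are already word-segmented into token lists, uses the common idf variant log(N / df), and omits the morphological and context features and the word-repair strategies the paper mentions. The function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, max_n=3):
    """Return every contiguous n-gram (n = 1..max_n) as a tuple of tokens."""
    out = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            out.append(tuple(tokens[i:i + n]))
    return out


def tfidf_keywords(docs, doc_index, top_k=5, max_n=3):
    """Rank the n-gram candidates of docs[doc_index] by tf * idf.

    docs is a list of token lists (assumed to be pre-segmented text).
    tf is the candidate's relative frequency within the target document;
    idf is log(N / df), where df counts the documents containing it.
    """
    # Document frequency: count each candidate once per document.
    candidate_sets = [set(ngrams(d, max_n)) for d in docs]
    df = Counter(g for s in candidate_sets for g in s)

    tf = Counter(ngrams(docs[doc_index], max_n))
    total = sum(tf.values())
    n_docs = len(docs)

    scores = {
        g: (count / total) * math.log(n_docs / df[g])
        for g, count in tf.items()
    }
    # Highest score first; ties are broken by the n-gram itself so the
    # ordering is deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Hypothetical toy corpus of pre-segmented news snippets.
docs = [
    ["北京", "奥运会", "开幕", "北京", "奥运会"],
    ["股市", "今日", "上涨"],
    ["奥运会", "门票", "今日", "发售"],
]

for gram, score in tfidf_keywords(docs, 0, top_k=3):
    print("".join(gram), round(score, 4))
```

In this sketch a candidate that occurs in every document gets an idf of zero, so it can never rank as a keyword; a smoothed idf would be a reasonable alternative if that is too strict.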

Journal: Wuhan University Journal of Natural Sciences
Publication Date: Sep 1, 2007
Citations: 99

Similar Papers

Keyword Extraction Using Support Vector Machine
Kuo Zhang ... Hui Xu
01 Jan 2006

Complex Network based Supervised Keyword Extractor
Swagata Duari ... Vasudha Bhatnagar
Expert Systems with Applications | VOL. 140
16 Aug 2019

Keyword Extraction Based on Selectivity and Generalized Selectivity
Slobodan Beliga ... Sanda Martinčić-Ipšić
01 Jan 2018

Ensemble of keyword extraction methods and classifiers in text classification
Aytuğ Onan ... Hasan Bulut
Expert Systems with Applications | VOL. 57
29 Mar 2016
