A Text Similarity Measurement Based on Semantic Fingerprint of Characteristic Phrases

Shanchen Pang, Hua Zhao, Ting Liu, Hongqi Chen, Jiamin Yao

doi:10.1049/cje.2019.12.011

Abstract

Text similarity measurement is the basis for quantifying the degree of matching between two or more texts. Traditional large-scale similarity detection methods based on digital fingerprints offer high detection speed, but they are suitable only for exact-match detection. We propose a Chinese text similarity measurement method based on the semantics of feature phrases. Natural language processing (NLP) techniques are used to pre-process the text, keywords are extracted with the term frequency-inverse document frequency (TF-IDF) model, and the keywords are further screened to obtain feature words. The exact meaning of each word and the semantic similarities between words are obtained from the HowNet semantic dictionary. Concepts are then substituted to form the feature phrases, from which a semantic fingerprint is generated and the similarity is calculated. The experimental results indicate that the proposed method outperforms the traditional digital fingerprinting method in similarity detection in terms of accuracy rate, recall rate, and F-value.
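The pipeline described in the abstract (TF-IDF keywords, concept substitution, fingerprint, similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the fingerprint algorithm, so a SimHash-style weighted fingerprint with a Hamming-distance similarity is assumed here. The `CONCEPTS` table is a hypothetical toy stand-in for a HowNet lookup, the function names are illustrative, and the input is assumed to be already tokenized (Chinese text would first need a word segmenter).

```python
import hashlib
import math
from collections import Counter

# Hypothetical stand-in for a HowNet lookup: word -> canonical concept.
CONCEPTS = {"car": "vehicle", "automobile": "vehicle",
            "quick": "fast", "rapid": "fast"}


def tfidf_keywords(doc_tokens, corpus, top_k=5):
    """Return the top_k tokens of doc_tokens mapped to TF-IDF weights."""
    tf = Counter(doc_tokens)
    n_docs = len(corpus)
    scores = {}
    for term, count in tf.items():
        df = sum(1 for doc in corpus if term in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothed IDF
        scores[term] = (count / len(doc_tokens)) * idf
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:top_k])


def to_concepts(weights):
    """Replace each keyword by its concept and merge the weights."""
    merged = Counter()
    for term, weight in weights.items():
        merged[CONCEPTS.get(term, term)] += weight
    return dict(merged)


def simhash(weights, bits=64):
    """Assumed fingerprint: weighted SimHash over the feature weights."""
    acc = [0.0] * bits
    for term, weight in weights.items():
        digest = hashlib.md5(term.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for i in range(bits):
            acc[i] += weight if (h >> i) & 1 else -weight
    return sum(1 << i for i in range(bits) if acc[i] > 0)


def similarity(fp_a, fp_b, bits=64):
    """Similarity in [0, 1] as 1 minus the normalized Hamming distance."""
    return 1.0 - bin(fp_a ^ fp_b).count("1") / bits


def fingerprint(doc_tokens, corpus):
    """Full sketch pipeline: keywords -> concepts -> fingerprint."""
    return simhash(to_concepts(tfidf_keywords(doc_tokens, corpus)))
```

In this sketch, two documents that use different surface words for the same concepts (for example "quick car" and "rapid automobile") map to the same concept weights and therefore to the same fingerprint, which is the effect that concept substitution is meant to achieve over a purely lexical fingerprint.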

Journal: Chinese Journal of Electronics
Publication Date: Mar 1, 2020
Citations: 8

Similar Papers

Natural Language Processing Technology Used in Artificial Intelligence Scene of Law for Human Behavior
Jin Ning
Wireless Communications and Mobile Computing | VOL. 2022
24 Mar 2022

Natural Language Processing for the Semantic Web
Diana Maynard ... Isabelle Augenstein
01 Jan 2017

Finding warning markers: Leveraging natural language processing and machine learning technologies to detect risk of school violence
Yizhao Ni ... Michael Sorter
International Journal of Medical Informatics | VOL. 139
25 Apr 2020

Leveraging AI automated emergency response with natural language processing: Enhancing real-time decision making and communication
Zesheng Li
Applied and Computational Engineering | VOL. 71
27 Aug 2024
