Paper Similarity Detection Method Based on Distance Matrix Model with Row-Column Order Penalty Factor

Jun Li, Junshan Pan, Yaqing Han

doi:10.4304/jmm.9.8.998-1004

Abstract

Paper similarity detection draws on grammatical and semantic analysis, word segmentation, similarity measurement, document summarization, and related technologies, and it spans multiple disciplines. However, the main existing detection models have several problems: incomplete specifications for segmentation preprocessing, the influence of semantic order on detection, the evaluation of near-synonyms, and difficulty in paper backtracking. To address these problems, this paper presents a two-step segmentation model based on special identifiers and the Shapley value, which improves segmentation accuracy. For similarity comparison, it proposes a distance matrix model with a row-column order penalty factor, which recognizes new words through a search engine exponent. The model integrates the characteristics of vector detection, Hamming distance, and the longest common substring. By redefining the distance matrix and adding ordinal measures, it detects near-synonyms, word deletions, and changes in word order, which makes sentence similarity detection more effective in terms of semantics and backbone word segmentation. Compared with traditional paper similarity retrieval, the proposed method offers more accurate word segmentation, lower computational cost, greater reliability, and higher efficiency, which is significant for research on word segmentation, similarity detection, and document summarization.
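The abstract names three ingredients (Hamming distance, the longest common substring, and a penalty for changed word order) without giving their formulas. The sketch below is only an illustration of those general ideas, not the authors' distance matrix model. The function names, the `penalty` weight, and the inversion-based order measure are all assumptions made for this example.

```python
from itertools import combinations


def longest_common_substring(a, b):
    """Length of the longest contiguous run of tokens shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def order_penalized_similarity(a, b, penalty=0.5):
    """Hypothetical score: token overlap scaled down by out-of-order pairs.

    `penalty` is an assumed weight in [0, 1]; it is not taken from the paper.
    """
    # Position in b (first occurrence) of each token of a that also occurs in b.
    positions = [b.index(tok) for tok in a if tok in b]
    if not positions:
        return 0.0
    overlap = len(positions) / max(len(a), len(b))
    pairs = list(combinations(positions, 2))
    # A pair is inverted when the relative order in b differs from that in a.
    inversions = sum(p > q for p, q in pairs)
    disorder = inversions / len(pairs) if pairs else 0.0
    return overlap * (1.0 - penalty * disorder)
```

With these assumed definitions, two sentences containing the same words in the same order score 1.0, while the same words in fully reversed order score 0.5 at the default `penalty`.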

Journal: Journal of Multimedia
Publication Date: Aug 22, 2014
Citations: 1

Similar Papers

A Similarity Detection Method Based on Distance Matrix Model with Row-Column Order penalty Factor
Jun Li ... Yaqing Han
Bulletin of Electrical Engineering and Informatics | VOL. 3
01 Dec 2014

New Words Discovery Method Based On Word Segmentation Result
Heyang Liu ... Pengdong Gao
01 Jun 2018

Unsupervised word segmentation for Sesotho using Adaptor Grammars
Mark Johnson
01 Jan 2008

Design of Korean Text Resource Information Mining and Management Platform
Chunying Wang
02 Dec 2021
