Methods for cross-language plagiarism detection

Alberto Barrón-Cedeño, Parth Gupta, Paolo Rosso

doi:10.1016/j.knosys.2013.06.018

Abstract

Three factors are driving the rise of plagiarism across languages: (i) speakers of under-resourced languages often consult documentation in a foreign language, (ii) people immersed in a foreign country can still consult material written in their native language, and (iii) people are often interested in writing in a language other than their native one. Most efforts for automatically detecting cross-language plagiarism depend on a preliminary translation, which is not always available.

In this paper we propose a freely available architecture for plagiarism detection across languages covering the entire process: heuristic retrieval, detailed analysis, and post-processing. On top of this architecture we explore the suitability of three cross-language similarity estimation models: Cross-Language Alignment-based Similarity Analysis (CL-ASA), Cross-Language Character n-Grams (CL-CNG), and Translation plus Monolingual Analysis (T+MA). The three models differ inherently in their nature and in the resources they require.

The three models are tested extensively under the same conditions on the different plagiarism detection sub-tasks, something never done before. The experiments show that T+MA produces the best results, closely followed by CL-ASA. Still, CL-ASA obtains higher precision, an important factor in plagiarism detection when less user intervention is desired.
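To make the idea behind CL-CNG concrete, the following is a minimal sketch of character n-gram similarity between two texts in different languages. It is not the authors' implementation: the abstract does not state the configuration, so the n-gram length (3), the raw term-frequency weighting, the cosine measure, the Latin-script-only normalisation, and the function names `char_ngrams`, `cosine`, and `cl_cng_similarity` are all assumptions made for illustration.

```python
import math
import re
import unicodedata
from collections import Counter


def char_ngrams(text, n=3):
    """Return a bag of character n-grams (n=3 is an assumption)."""
    # Lowercase and strip diacritics so "detección" and "deteccion" match.
    decomposed = unicodedata.normalize("NFKD", text.lower())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Assumption: keep only Latin letters, digits and single spaces.
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", stripped)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cl_cng_similarity(doc_a, doc_b, n=3):
    """Hypothetical helper: n-gram cosine similarity of two raw texts."""
    return cosine(char_ngrams(doc_a, n), char_ngrams(doc_b, n))


if __name__ == "__main__":
    english = "automatic plagiarism detection"
    spanish = "detección automática de plagio"
    print(cl_cng_similarity(english, spanish))
```

A model of this kind needs no translation resources, but it can only work when the two languages share a script and a fair amount of vocabulary, as in the English and Spanish pair used here.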

Journal: Knowledge-Based Systems
Publication Date: Jul 3, 2013
Citations: 77
License type: other-oa

Similar Papers

On the mono- and cross-language detection of text reuse and plagiarism
Alberto Barrón-Cedeño
19 Jul 2010

Cross-language plagiarism detection over continuous-space- and knowledge graph-based representations of language
Marc Franco-Salvador ... Rafael E Banchs
Knowledge-Based Systems | VOL. 111
06 Aug 2016

Arabic English Cross-Lingual Plagiarism Detection Based on Keyphrases Extraction, Monolingual and Machine Learning Approach
Mohammed Albared ... Muneer A S Hazaa
Asian Journal of Research in Computer Science
13 Feb 2019

Plagiarism Detection Tools in Learning Management Systems
Sergey Butakov ... Vladislav Shcherbinin
01 Jan 2009
