Intrinsic plagiarism analysis

Benno Stein, Nedim Lipka, Peter Prettenhofer

doi:10.1007/s10579-010-9115-y

Abstract

Research in automatic text plagiarism detection focuses on algorithms that compare suspicious documents against a collection of reference documents. Recent approaches perform well in identifying copied or modified foreign sections, but they assume a closed world in which a reference collection is given. This article investigates whether a computer program can detect plagiarism when no reference can be provided, e.g., when the foreign sections stem from a book that is not available in digital form. We call this problem class intrinsic plagiarism analysis; it is closely related to the problem of authorship verification. Our contributions are threefold. (1) We organize the algorithmic building blocks for intrinsic plagiarism analysis and authorship verification and survey the state of the art. (2) We show how the meta learning approach of Koppel and Schler, termed “unmasking”, can be employed to post-process unreliable stylometric analysis results. (3) We operationalize and evaluate an analysis chain that combines document chunking, style model computation, one-class classification, and meta learning.
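To make the first three stages of the analysis chain named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes fixed-size word chunks, two toy style markers (average word length and function-word ratio), and a simple z-score outlier test as a stand-in for one-class classification. The function names, the function-word list, and the threshold are all hypothetical, and the unmasking (meta learning) step is omitted.

```python
import statistics

# Hypothetical, tiny function-word list; real style models use far richer features.
FUNCTION_WORDS = {"the", "of", "and", "a", "in", "to", "is", "that", "it", "was"}


def chunk_words(text, size):
    """Document chunking: consecutive chunks of `size` words (last may be shorter)."""
    words = text.split()
    return [words[i:i + size] for i in range(0, len(words), size)]


def style_features(words):
    """Style model (toy): average word length and function-word ratio of a chunk."""
    avg_len = sum(len(w) for w in words) / len(words)
    fw_hits = sum(w.lower().strip(".,;:!?") in FUNCTION_WORDS for w in words)
    return (avg_len, fw_hits / len(words))


def outlier_chunks(text, size=50, threshold=2.0):
    """Outlier test (stand-in for one-class classification).

    Returns the indices of chunks whose value on any style feature lies more
    than `threshold` population standard deviations from the document mean.
    """
    chunks = chunk_words(text, size)
    if not chunks:
        return []
    feats = [style_features(c) for c in chunks]
    flagged = set()
    for dim in range(len(feats[0])):
        values = [f[dim] for f in feats]
        mean = statistics.mean(values)
        sd = statistics.pstdev(values)
        if sd == 0:
            continue  # feature is constant across chunks, so it carries no signal
        for i, value in enumerate(values):
            if abs(value - mean) / sd > threshold:
                flagged.add(i)
    return sorted(flagged)
```

Calling `outlier_chunks(document_text)` returns the indices of chunks whose style deviates from the rest of the document. In a fuller pipeline these candidates would be passed to a post-processing step such as unmasking, since a per-chunk decision of this kind is unreliable on its own.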

Journal: Language Resources and Evaluation
Publication Date: Jan 20, 2010
Citations: 188

Similar Papers

TDRLM: Stylometric learning for authorship verification by Topic-Debiasing
Xinyu Hu ... Hanbo Yu
Expert Systems with Applications | VOL. 233
16 Jun 2023

E-mail authorship verification for forensic investigation
Farkhund Iqbal ... Mourad Debbabi
22 Mar 2010

Meta Learning for Few-Shot One-Class Classification
Gabriel Dahia ... Maurício Pamplona Segundo
AI | VOL. 2
22 Apr 2021

A Unified Approach to Authorship Attribution and Verification
Xavier Puig ... Josep Ginebra
The American Statistician | VOL. 70
02 Jul 2016
