Unsupervised Model for Detecting Plagiarism in Internet-based Handwritten Arabic Documents

Mahmoud Zaher, Abdulaziz Shehab, Farahat Farag Farahat, Mohamed Elhoseny

doi:10.4018/joeuc.2020040103

Abstract

Due to the rapid increase of internet-based data, there is an urgent need for a robust, intelligent document security mechanism. Although there have been many attempts to build plagiarism detection systems for natural language documents, the unlimited variation and differing writing styles of each character in Arabic documents make building such systems challenging. Depending on its position in a word, the same Arabic letter can be written in three different ways, which makes handwritten character recognition a cumbersome process. This article proposes an intelligent unsupervised model, called ASTAP, to detect plagiarism in these documents. First, a handwritten Arabic character recognition system is proposed using the Grey Wolf Optimization (GWO) algorithm. Then, a modified Abstract Syntax Tree (AST) is used to match the contents of the Arabic documents and detect any similarity. Compared to state-of-the-art methods, ASTAP improves the effectiveness of plagiarism detection in terms of the matched similarity ratio, the precision ratio, and the processing time.

Highlights

Due to the rapid increase of internet-based data, there is an urgent need for a robust, intelligent document security mechanism
Compared to state-of-the-art methods, ASTAP improves the effectiveness of plagiarism detection in terms of the matched similarity ratio, the precision ratio, and the processing time
The performance of the proposed ASTAP is calculated in different terms, such as the similarity, the

Summary

RELATED WORKS

Much research aims to detect similarities between documents. Despite the considerable effort devoted to discovering similarity in Arabic documents, most previous work has focused on the electronic form of these documents.

Plaggie produces its results according to the configuration parameters supplied. It resembles JPlag, but it is a stand-alone, command-line Java application that has to be set up locally. These tools use tokenization followed by the Greedy String Tiling (GST) algorithm to detect matches between two source code files.

(Borner, et al, 2012) introduced a system to check for plagiarism in Arabic documents. In the preprocessing phase it tokenizes the text, removes stop-words, and converts the words to their roots; after that, the words are replaced with their synonyms.

The radial basis function (RBF) kernel is selected and used in this paper, as it is the most suitable for SVM in applications such as image and document processing (Yuan, et al, 2017).

Abstract Syntax Tree (AST) (Zaher, et al, 2017) is a similarity detection algorithm that analyzes similar detection schemes to detect plagiarism efficiently.

    Initialize: ci := node i, TMC := count for all nodes, x := node value
    for each ci in {TT} do
        if 0 < n < TMC
            ks := x + ci
    Endfor
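The tokenization and Greedy String Tiling step mentioned above can be illustrated with a minimal sketch. This is a simplified version of the classic algorithm (no hashing speed-up), not the implementation used by Plaggie, JPlag, or ASTAP. The function names `greedy_string_tiling` and `tiling_similarity`, the `min_match_len` parameter, and the normalization by the combined token count are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Tile = Tuple[int, int, int]  # (start in a, start in b, length)


def greedy_string_tiling(a: Sequence[str], b: Sequence[str],
                         min_match_len: int = 2) -> List[Tile]:
    """Simplified Greedy String Tiling over two token sequences.

    Repeatedly finds the longest common runs of unmarked tokens and
    marks them as tiles, until no run of at least min_match_len remains.
    """
    if min_match_len < 1:
        raise ValueError("min_match_len must be at least 1")
    marked_a = [False] * len(a)
    marked_b = [False] * len(b)
    tiles: List[Tile] = []
    while True:
        max_match = min_match_len
        matches: List[Tile] = []
        # Phase 1: collect all maximal matches of the current best length.
        for i in range(len(a)):
            if marked_a[i]:
                continue
            for j in range(len(b)):
                if marked_b[j]:
                    continue
                k = 0
                while (i + k < len(a) and j + k < len(b)
                       and a[i + k] == b[j + k]
                       and not marked_a[i + k] and not marked_b[j + k]):
                    k += 1
                if k > max_match:
                    matches, max_match = [(i, j, k)], k
                elif k == max_match:
                    matches.append((i, j, k))
        if not matches:
            break
        # Phase 2: mark matches as tiles, skipping any that overlap a
        # tile already placed in this round.
        for i, j, k in matches:
            if any(marked_a[i:i + k]) or any(marked_b[j:j + k]):
                continue
            for t in range(k):
                marked_a[i + t] = True
                marked_b[j + t] = True
            tiles.append((i, j, k))
    return tiles


def tiling_similarity(a: Sequence[str], b: Sequence[str],
                      min_match_len: int = 2) -> float:
    """Share of tokens covered by tiles, normalized to the range [0, 1]."""
    if not a and not b:
        return 0.0
    covered = sum(k for _, _, k in greedy_string_tiling(a, b, min_match_len))
    return 2.0 * covered / (len(a) + len(b))
```

Calling `tiling_similarity(text_a.split(), text_b.split())` on two whitespace-tokenized strings gives a rough overlap score. Raising `min_match_len` suppresses short coincidental matches at the cost of missing short copied fragments.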

RESULTS AND DISCUSSION

Evaluation Criteria and Datasets

Results and Discussion

CONCLUSION AND FUTURE WORK


Journal: Journal of Organizational and End User Computing
Publication Date: Apr 1, 2020
Citations: 32
License type: CC BY 3.0


