An Improved DE Algorithm to Optimise the Learning Process of a BERT-based Plagiarism Detection Model

Seyed Vahid Moravvej, Seyed Jalaleddin Mousavirad, Zahra Sobhaninia, Gerald Schaefer, Diego Oliva

doi:10.1109/cec55065.2022.9870280

Abstract

Plagiarism detection is a challenging task that aims to identify similar items in two documents. In this paper, we present a novel approach to automatic plagiarism detection that combines BERT (bidirectional encoder representations from transformers) word embedding, attention mechanism-based long short-term memory (LSTM) networks, and an improved differential evolution (DE) algorithm for weight initialisation. BERT is used to pre-train deep bidirectional representations in all layers, and the pre-trained BERT model can be fine-tuned with only one extra output layer, without significant changes in architecture. Deep learning algorithms often initialise weights randomly and then train with gradient-based optimisation algorithms such as back-propagation, which makes them susceptible to getting trapped in local optima. To address this, population-based metaheuristic algorithms such as DE can be used. We propose an improved DE algorithm with a clustering-based mutation operator, in which a winning cluster of candidate solutions is first identified and a new updating strategy is then applied to include new candidate solutions in the current population. The proposed DE algorithm is used in the LSTM, attention mechanism, and feed-forward neural networks to yield the initial seeds for subsequent gradient-based optimisation. We compare our proposed model with conventional and population-based approaches on three datasets (SNLI, MSRP and SemEval2014) and show that it achieves superior plagiarism detection performance.
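
The abstract does not spell out the clustering-based mutation operator or the updating strategy, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes k-means as the clustering step, takes the "winning cluster" to be the one with the lowest mean loss, uses that cluster's best member as the base vector for mutation, and substitutes standard binomial crossover with greedy replacement for the paper's updating strategy. The names `kmeans` and `de_init`, and all parameter values, are hypothetical.

```python
import numpy as np


def kmeans(points, k, rng, iters=10):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels


def de_init(loss, dim, pop_size=20, gens=50, F=0.5, CR=0.9, k=3, seed=0):
    """Search for a low-loss weight vector to seed gradient-based training."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-1.0, 1.0, size=(pop_size, dim))
    fit = np.array([loss(x) for x in pop])
    for _ in range(gens):
        # Assumption: the winning cluster is the one with the lowest mean loss.
        labels = kmeans(pop, k, rng)
        win = min(np.unique(labels), key=lambda c: fit[labels == c].mean())
        members = np.flatnonzero(labels == win)
        # Assumption: the best member of the winning cluster is the base vector.
        base = pop[members[fit[members].argmin()]].copy()
        for i in range(pop_size):
            r1, r2 = rng.choice(pop_size, size=2, replace=False)
            mutant = base + F * (pop[r1] - pop[r2])
            # Binomial crossover; force at least one gene from the mutant.
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            f_trial = loss(trial)
            # Greedy replacement stands in for the paper's updating strategy.
            if f_trial <= fit[i]:
                pop[i], fit[i] = trial, f_trial
    best = fit.argmin()
    return pop[best], fit[best]


if __name__ == "__main__":
    # Toy stand-in for a network loss over a flattened weight vector.
    sphere = lambda w: float(np.sum(w ** 2))
    weights, value = de_init(sphere, dim=5)
    print(weights, value)
```

In a setting like the one described, `loss` would presumably evaluate the network's training loss for a flattened weight vector, and the returned vector would be reshaped into the layer weights before back-propagation begins.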

Similar Papers

U-Shaped Assembly Line Balancing by Using Differential Evolution Algorithm
Poontana Sresracoo ... Nuchsara Kriengkorakot
Mathematical and Computational Applications | VOL. 23
12 Dec 2018

An Automated Toxicity Classification on Social Media Using LSTM and Word Embedding.
Ahmad Alsharef ... Karan Aggarwal
Computational Intelligence and Neuroscience | VOL. 2022
15 Feb 2022

Modified Bidirectional Encoder Representations From Transformers Extractive Summarization Model for Hospital Information Systems Based on Character-Level Tokens (AlphaBERT): Development and Performance Evaluation.
Yen-Pin Chen ... Feipei Lai
JMIR Medical Informatics | VOL. 8
29 Apr 2020

Bidirectional encoders to state-of-the-art: a review of BERT and its transformative impact on natural language processing
Rajesh Gupta
Информатика. Экономика. Управление - Informatics. Economics. Management | VOL. 3
02 Mar 2024
