Optimisation of plagiarism detection using vector space model on CUDA architecture

Jiffriya Mohamed Abdul Cader, Hasindu Gamaarachchi, Roshan G Ragel, Akmal Jahan Mohamed Abdul Cader

doi:10.1504/ijica.2022.125675

Abstract

Plagiarism is a rapidly growing problem among students submitting assignments, reports and publications in universities and other educational institutions, owing to the easy accessibility of abundant e-resources on the internet. Existing tools become inefficient in terms of processing time when dealing with a large number of documents with large content. Therefore, we have focused on software-based acceleration of plagiarism detection using the CPU and GPU. Initially, a serial version of the vector space model was implemented on the CPU and tested with 1,000 documents, which took 1,641 s. As processing time was the performance bottleneck, we then developed a parallel version of the model on graphics processing units (GPUs) using compute unified device architecture (CUDA) and tested it with the same dataset; it took only 36 s, a 45x speed-up over the CPU. This version was then optimised further and took only 4 s for the same dataset, which was 389x faster than the serial implementation.
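As a rough illustration of the kind of computation the abstract refers to, the following is a minimal serial sketch of vector-space-model document comparison in Python. It is not the authors' implementation: the whitespace tokenisation, the raw term-frequency weighting, and the function names `term_vector`, `cosine_similarity` and `pairwise_similarities` are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def term_vector(text):
    """Map a document to a sparse term-frequency vector (word -> count).

    Assumption: lower-casing and whitespace splitting stand in for
    whatever preprocessing a real system would use.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def pairwise_similarities(documents):
    """Compare every unordered pair of documents; returns {(i, j): score}."""
    vectors = [term_vector(d) for d in documents]
    return {
        (i, j): cosine_similarity(vectors[i], vectors[j])
        for i, j in combinations(range(len(vectors)), 2)
    }
```

In a sketch like this, n documents yield n(n-1)/2 pairs, so 1,000 documents would require 499,500 comparisons. Each pair is scored independently of the others, which is what makes this kind of workload a natural candidate for GPU parallelisation.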

Full Text