Comparison of parallel central processing unit‐ and graphics processing unit‐based implementations of greedy string tiling algorithm for source code plagiarism detection

Marko J Mišić,Milo V Tomašević

doi:10.1002/cpe.7135

Abstract

Massive‐enrollment computing courses often involve practical training through programming assignments and projects, which are frequent targets for plagiarism. Source code similarity detection tools are used to prevent such misbehavior. Parallel processing has recently become a viable technique for speeding up the processing of large workloads. This article examines the parallelization of a source code similarity detection method based on the greedy string tiling and Karp–Rabin algorithms. Both CPU and GPU parallelization approaches are discussed. The CPU implementation uses Pthreads, whereas the GPU implementation employs CUDA. Depending on the evaluated dataset, which consists of real student assignment code, speedups of up to seven times over the sequential version are achieved. Evaluation results on both platforms are compared and discussed in detail.

Full Text