GPTSniffer: A CodeBERT-based classifier to detect source code written by ChatGPT

Phuong T Nguyen, Juri Di Rocco, Claudio Di Sipio, Riccardo Rubei, Davide Di Ruscio, Massimiliano Di Penta

doi:10.1016/j.jss.2024.112059

Abstract

Since its launch in November 2022, ChatGPT has gained popularity among users, especially programmers who use it to solve development issues. However, while offering a practical solution to programming problems, ChatGPT should be used primarily as a supporting tool (e.g., in software education) rather than as a replacement for humans. Thus, automatically detecting source code generated by ChatGPT is necessary, and tools for identifying AI-generated content need to be adapted to work effectively with code. This paper presents GPTSniffer, a novel approach to detecting source code written by AI, built on top of CodeBERT. We conducted an empirical study to investigate the feasibility of automated identification of AI-generated code, and the factors that influence this ability. The results show that GPTSniffer can accurately classify whether code is human-written or AI-generated, outperforming two baselines, GPTZero and OpenAI Text Classifier. The study also shows that similar training data or a classification context with paired snippets helps boost prediction performance. We conclude that GPTSniffer can be leveraged in different contexts, e.g., in software engineering education, where teachers use the tool to detect cheating and plagiarism, or in development, where AI-generated code may require specific quality assurance activities.

Journal: Journal of Systems and Software
Publication Date: Apr 16, 2024
Citations: 1
