Can Large Language Models Comprehend Code Stylometry?
Abstract
Translate Article 
Code Authorship Attribution (CAA) has several applications such as copyright disputes, plagiarism detection and criminal prosecution. Existing studies mainly focused on CAA by proposing machine learning (ML) and Deep Learning (DL) based techniques. The main limitations of ML-based techniques are (a) manual feature engineering is required to train these models and (b) they are vulnerable to adversarial attack. In this study, we initially fine-tune five Large Language Models (LLMs) for CAA and evaluate their performance. Our results show that LLMs are robust and less vulnerable compared to existing techniques in CAA task.