TCCCD: Triplet-Based Cross-Language Code Clone Detection

Yong Fang, Zhonglin Liu, Yijia Xu, Fangzheng Zhou

doi:10.3390/app132112084

Abstract

Code cloning is common in software development: developers reuse existing code to work faster and more efficiently. Existing clone-detection methods mainly target clones within a single programming language. To address code clones that arise in cross-platform development, we propose TCCCD (Triplet-Based Cross-Language Code Clone Detection), a machine-learning method that accurately detects code clones between different programming languages. We use the pre-trained model UniXcoder to map programs written in different languages into the same vector space and learn their code representations, and then fine-tune the model using triplet learning to improve its effectiveness in cross-language clone detection. To assess the approach, we conducted thorough comparative experiments on the dataset provided by the CLCDSA paper (Cross Language Code Clone Detection using Syntactical Features and API Documentation). The results show a significant improvement over the state-of-the-art baselines, with precision, recall, and F1-measure scores of 0.96, 0.91, and 0.93, respectively.
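The triplet-learning idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the distance function, the margin, or how triplets are sampled, so the cosine distance, the `margin=0.5` default, and the three toy vectors below are assumptions made for illustration only; in the real system the vectors would be code embeddings produced by UniXcoder rather than hand-written lists.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge-style triplet loss: zero once the positive is closer to the
    anchor than the negative by at least `margin` (assumed value)."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical embeddings standing in for encoder outputs.
java_sum = [0.9, 0.1, 0.0]     # anchor: a Java program
python_sum = [0.8, 0.2, 0.1]   # positive: its clone written in Python
python_sort = [0.1, 0.9, 0.3]  # negative: an unrelated Python program

# Well-separated triplet: the clone is already much closer than the non-clone.
loss_ok = triplet_loss(java_sum, python_sum, python_sort)

# Swapped roles: the "positive" is far away, so the loss is large.
loss_bad = triplet_loss(java_sum, python_sort, python_sum)

print(loss_ok, loss_bad)
```

During fine-tuning, a loss of this shape would be minimized over many (anchor, clone, non-clone) triplets, which pulls clones in different languages together in the shared vector space and pushes non-clones apart.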

Journal: Applied Sciences
Publication Date: Nov 6, 2023
Citations: 1
License type: CC BY 4.0
