Clone Instances Research Articles

Code cloning is a common practice in software development: developers reuse existing code to work faster and more efficiently. Existing clone-detection methods mainly focus on code clones within a single programming language. To address code clones that arise in cross-platform development, we propose TCCCD (Triplet-Based Cross-Language Code Clone Detection), a machine-learning method that detects code clone instances across different programming languages. TCCCD uses the pre-trained model UniXcoder to map programs written in different languages into the same vector space and learn their code representations, and is then fine-tuned with triplet learning to improve its effectiveness in cross-language clone detection. We evaluated the approach in comparative experiments on the dataset provided by the CLCDSA paper (Cross Language Code Clone Detection using Syntactical Features and API Documentation). The results showed a significant improvement over the state-of-the-art baselines, with precision, recall, and F1-measure scores of 0.96, 0.91, and 0.93, respectively.
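
The triplet-learning step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not state the loss formula, the distance metric, or the margin, so the Euclidean distance, the margin of 1.0, and the names `euclidean` and `triplet_loss` are assumptions made for illustration. The vectors are hard-coded toy values standing in for code embeddings, not real UniXcoder output.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (assumed form, not taken from the paper).

    The loss is zero once the anchor is closer to the positive (a clone in
    another language) than to the negative (a non-clone) by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy embeddings: imagine a Java snippet (anchor), its Python clone (positive),
# and an unrelated Python snippet (negative).
anchor = [0.0, 0.0]
positive = [0.0, 0.0]
negative = [3.0, 4.0]

print(triplet_loss(anchor, positive, negative))  # well separated: no loss
print(triplet_loss(anchor, negative, positive))  # roles swapped: large loss
```

Minimizing a loss of this shape during fine-tuning pulls clone pairs together in the shared vector space and pushes non-clones apart, which is the general idea behind triplet learning.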

Model clone detection has been an active research area in recent years. Closed clone instances contain all the information of the model clones, so they inherently ensure the completeness of detection results. To improve the completeness of clone detection, a novel model clone detection algorithm named CL_MCD (Closed Model Clone Detection) is proposed. CL_MCD focuses on exactly matched clones and aims to find all closed clone instances. Its main innovation is in the detection phase. Each time the breadth-first search of the model graph finds a new node pair with the same label, CL_MCD transforms all the node pairs into a clone pair and, if its size is greater than or equal to the minimum clone size, puts it into a set that contains all the candidate clone instances. Every candidate clone instance is then compared with all the others in the set, and any candidate that is part of another instance is deleted. This filtering removes redundant clone instances and keeps only the closed clone instances in the set. Theoretical analysis and experimental studies demonstrate that CL_MCD achieves a higher degree of completeness than CloneDetective.
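
The filtering step described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: each candidate clone instance is modeled as a set of node pairs, "one part of any other instance" is interpreted as being a proper subset, and the function name `keep_closed` and the parameter `min_size` are hypothetical. The breadth-first search that produces the candidates is not shown.

```python
def keep_closed(candidates, min_size=2):
    """Keep only candidates that are not contained in a larger candidate.

    Each candidate is an iterable of node pairs. Candidates below `min_size`
    are discarded first, mirroring the minimum-clone-size check in the
    abstract; the subset test is an assumed reading of "one part of".
    """
    unique = {frozenset(c) for c in candidates}
    large_enough = {c for c in unique if len(c) >= min_size}
    # `c < other` is Python's proper-subset test on sets.
    return {c for c in large_enough if not any(c < other for other in large_enough)}


# Toy candidates: node pairs are (node in model A, node in model B).
candidates = [
    {("a1", "b1"), ("a2", "b2")},                # contained in the next one
    {("a1", "b1"), ("a2", "b2"), ("a3", "b3")},  # closed
    {("x1", "y1")},                              # below the minimum size
    {("c1", "d1"), ("c2", "d2")},                # closed
]

for instance in sorted(keep_closed(candidates), key=len):
    print(sorted(instance))
```

Comparing every candidate with every other one is quadratic in the number of candidates, which matches the pairwise comparison the abstract describes; the abstract does not say whether CL_MCD uses a faster scheme.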

Articles published on Clone Instances

CloneRipples: predicting change propagation between code clone instances by graph-based deep learning

TCCCD: Triplet-Based Cross-Language Code Clone Detection

A Completeness Optimized Algorithm for Closed Model Clone Detection

A Novel Optimized Path-Based Algorithm for Model Clone Detection
