A Systematic Review of Semantic Clone Detection Techniques in Software Systems

Ajad Kumar, Kuldeep Kumar, Rashmi Yadav

doi:10.1088/1757-899x/1022/1/012074

Abstract

Code clones are repeated program structures with significant similarity that occur within a software system. Various studies have found that at least 20–30 percent of the code in a software system is duplicated. These clones cause maintenance difficulties, such as repeated bug fixing and bug propagation, that increase maintenance time, effort, and cost. To achieve better efficiency, all types of clones in a software system, both syntactic and semantic, need to be detected. Detecting semantic clones is considered more difficult than detecting syntactic clones because it requires semantic information about the program rather than structural similarity alone. Popular techniques for detecting semantic clones are based on program dependency graphs, abstract syntax trees, machine learning, deep learning, support vector machines, and similar approaches. This paper presents a systematic and extensive analysis of clone detection tools and techniques, with a special focus on semantic clone detection. It also highlights the pros and cons of existing clone detection techniques, along with their reported results, such as precision and recall.

Full Text