A Scalable Code Clone Detection Tool Based on Semantic Program Analysis

Sevak Sargsyan, Artiom Baloian, Shamil Kurmnagaleev, Hayk Aslanyan, Andrey Belevantsev

doi:10.15514/ispras-2015-27(1)-3

Abstract

This article describes methods for code clone detection. Based on an analysis of existing methods, a new approach to code clone detection is proposed for the C and C++ languages. The method relies on semantic analysis of the project, which allows code clones to be detected with high accuracy. It is implemented as part of the LLVM compiler, which allows it to outperform existing methods. The tool consists of three basic parts. The first part is Program Dependence Graph (PDG) generation and serialization. PDGs are constructed at compile time from LLVM's intermediate representation. Several simple optimizations are applied to these graphs, and they are then serialized to a file. The second part is analysis of the stored PDGs. PDGs are loaded from files and split into subgraphs, and every subgraph is considered a clone candidate. A new splitting method is proposed, which increases the number of detected clones. There are two types of clone detection algorithms. Algorithms of the first type try to prove that a pair of PDGs cannot be clones. These algorithms have linear complexity, which allows a huge number of PDG pairs to be processed. If they fail to rule a pair out, graph isomorphism algorithms are applied to detect similar subgraphs. The last part is an integrated system for automatically testing the accuracy of the algorithms. For a given project, a set of clones is generated automatically, and the clone detection algorithms are then applied to the original source and the generated one.
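
The two-stage detection described in the abstract (a cheap check that rules pairs out, followed by an expensive graph comparison) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `PDG` class, the label-histogram filter, and the brute-force backtracking matcher are hypothetical stand-ins and not the filters or isomorphism algorithms of the tool itself. The sketch also tests exact isomorphism, whereas the tool looks for similar subgraphs.

```python
from collections import Counter


class PDG:
    """Hypothetical stand-in for a program dependence graph."""

    def __init__(self, labels, edges):
        self.labels = list(labels)  # labels[i]: opcode-like label of node i
        self.edges = set(edges)     # directed (src, dst) pairs of node indices


def cannot_be_clones(a, b):
    """Linear-time filter (assumed): True means the pair is certainly not isomorphic."""
    if len(a.labels) != len(b.labels) or len(a.edges) != len(b.edges):
        return True
    # Isomorphic graphs must have identical label histograms.
    return Counter(a.labels) != Counter(b.labels)


def isomorphic(a, b):
    """Backtracking search for a label- and edge-preserving bijection.

    Exponential in the worst case, so it should run only on pairs
    that survived the cheap filter.
    """
    n = len(a.labels)
    mapping = {}   # node of a -> node of b
    used = set()   # nodes of b already taken

    def consistent(u, v):
        if a.labels[u] != b.labels[v]:
            return False
        if ((u, u) in a.edges) != ((v, v) in b.edges):
            return False
        for w, x in mapping.items():
            if ((u, w) in a.edges) != ((v, x) in b.edges):
                return False
            if ((w, u) in a.edges) != ((x, v) in b.edges):
                return False
        return True

    def extend(u):
        if u == n:
            return True
        for v in range(n):
            if v not in used and consistent(u, v):
                mapping[u] = v
                used.add(v)
                if extend(u + 1):
                    return True
                del mapping[u]
                used.discard(v)
        return False

    return extend(0)


def are_clones(a, b):
    """Run the cheap filter first; fall back to the expensive check."""
    if cannot_be_clones(a, b):
        return False
    return isomorphic(a, b)


# Usage: g2 is g1 with its nodes renumbered; g3 differs in one label.
g1 = PDG(["load", "add", "store"], [(0, 1), (1, 2)])
g2 = PDG(["add", "store", "load"], [(2, 0), (0, 1)])
g3 = PDG(["load", "mul", "store"], [(0, 1), (1, 2)])
print(are_clones(g1, g2), are_clones(g1, g3))
```

The ordering is the point of the design: the filter touches each node and edge once, so most non-matching pairs are discarded before the exponential search is ever started.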

Introduction

Reuse of source code fragments is common in software development. By copying and then modifying a piece of code, a developer can obtain the desired result. Clones can also arise in ways other than copying code fragments. Studies have shown that up to 20 percent of source code can consist of clones [1, 2]. The need for code clone detection arises when searching for functionally similar parts of a program in binary or source code, in automatic refactoring, and in finding semantic errors caused by incorrect copying of code fragments. This paper describes the architecture of a code clone detection tool that is highly accurate and scalable. Thanks to a number of new techniques and algorithms, it has become possible to analyze millions of lines of source code.

ISP RAS, 2015, vol. 27, issue 1, pp. 39-50

Types of clones

Textual approach

Lexical approach

Syntactic approach

Semantic approach

Comparison of approaches

Model of the code clone detection tool

PDG generation

PDG analysis for code clone detection

Splitting PDGs into subgraphs

Clone detection

Filtering

Conclusion

Journal: Proceedings of the Institute for System Programming of the RAS
Publication Date: Jan 1, 2015
Citations: 2
License type: cc-by
