Abstract

With the development of the Internet and the growth of open source software communities, there has been a surge in open source software. Code reuse, that is, copying, pasting, and modifying open source code, has become a convenient way for developers to save time and reduce labor costs. As a result, similar code fragments, known as code clones, are increasingly common in software projects. Code clones may introduce uncertainties into a program, which makes them an active research topic in urgent need of exploration. This paper summarizes current code clone detection tools and techniques in four categories and introduces one detection tool, NiCad, which has high recall and precision. However, NiCad is not well suited to large-scale code clone detection because it is slow when processing large amounts of code. We therefore accelerated NiCad's detection process and named the improved tool NiCad+. NiCad+ greatly improves the efficiency of NiCad without affecting its recall and precision: detection time is markedly shortened by reducing the number of matching operations. When tested with BigCloneEval, NiCad+ takes only 28.43% of the time taken by the original NiCad. When tested with varying input sizes, the accelerated detection process outperforms the original for inputs from 10 KLoC (thousand lines of code) to 5 MLoC.
