Abstract

Code mining has proven to be a promising approach to inferring implicit programming rules for finding software bugs. However, existing methods may report large numbers of false positives and false negatives. In this paper, we propose a novel approach called EAntMiner to improve the effectiveness of code mining. EAntMiner reduces noise from two sources: statements irrelevant to the rules of interest, and differing implementation forms of the same logic. During preprocessing, we employ program slicing to decompose the original source repository into independent sub-repositories. In each sub-repository, statements irrelevant to critical operations (which are automatically extracted from the source code) are excluded, and semantically equivalent implementations are normalized into a canonical form as far as possible. Moreover, because some bugs are difficult to detect by mining frequent patterns as rules, we further develop a kNN-based method to identify them. We have implemented EAntMiner and evaluated it on four large-scale C systems. EAntMiner detected 105 previously unknown bugs that have been confirmed by the corresponding development communities. A set of comparative evaluations also demonstrates that EAntMiner can effectively improve the precision of code mining.
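
The kNN-based idea mentioned above can be illustrated with a minimal sketch. This is not EAntMiner's actual algorithm, which the abstract does not specify. It assumes that each sliced code fragment has already been normalized into a set of operation names, that similarity is measured by Jaccard distance, and that a fragment is suspicious when it lacks an operation shared by all of its k nearest neighbours. The names `jaccard_distance`, `knn_missing_ops`, `corpus`, and `suspect`, as well as the sample data, are hypothetical.

```python
from collections import Counter


def jaccard_distance(a, b):
    """Jaccard distance between two sets (0.0 = identical, 1.0 = disjoint)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def knn_missing_ops(target, corpus, k=3, min_support=1.0):
    """Return operations that occur in at least `min_support` (a fraction)
    of the k nearest neighbours of `target` but are absent from `target`.

    Assumption: `target` and every element of `corpus` are sets of
    already-normalized operation names.
    """
    # Pick the k fragments closest to the target.
    neighbours = sorted(corpus, key=lambda ops: jaccard_distance(target, ops))[:k]
    if not neighbours:
        return set()
    # Count in how many neighbours each operation appears.
    counts = Counter(op for ops in neighbours for op in ops)
    return {
        op
        for op, n in counts.items()
        if n / len(neighbours) >= min_support and op not in target
    }


# Hypothetical operation sets, one per sliced fragment.
corpus = [
    frozenset({"lock", "read", "unlock"}),
    frozenset({"lock", "write", "unlock"}),
    frozenset({"lock", "read", "write", "unlock"}),
    frozenset({"alloc", "free"}),
]

# A fragment that takes a lock and reads but never unlocks.
suspect = frozenset({"lock", "read"})
missing = knn_missing_ops(suspect, corpus, k=3)
print(missing)  # {'unlock'}
```

In this toy data the three nearest neighbours of `suspect` all contain `unlock`, so the missing call is reported even though no frequent-pattern threshold over the whole corpus was consulted. A real system would need a richer representation than unordered sets of names.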
