Detecting Bugs by Discovering Expectations and Their Violations

Pan Bian,Chaoqun Yang,Bin Liang,Yan Cai,Wenchang Shi,Yan Zhang

doi:10.1109/tse.2018.2816639

Abstract

Code mining has been proven to be a promising approach to inferring implicit programming rules for finding software bugs. However, existing methods may report large numbers of false positives and false negatives. In this paper, we propose a novel approach called EAntMiner to improve the effectiveness of code mining. EAntMiner elaborately reduces noises from statements irrelevant to interesting rules and different implementation forms of the same logic. During preprocessing, we employ program slicing to decompose the original source repository into independent sub-repositories. In each sub-repository, statements irrelevant to critical operations (automatically extracted from source code) are excluded and various semantics-equivalent implementations are normalized into a canonical form as far as possible. Moreover, to tackle the challenge that some bugs are difficult to be detected by mining frequent patterns as rules, we further developed a kNN-based method to identify them. We have implemented EAntMiner and evaluated it on four large-scale C systems. EAntMiner successfully detected 105 previously unknown bugs that have been confirmed by corresponding development communities. A set of comparative evaluations also demonstrate that EAntMiner can effectively improve the precision of code mining.

Full Text