Pattern Mining-Based Warning Prioritization by Refining Abstract Syntax Tree

Xiuting Ge, Xuanye Li, Yuanyuan Sun, Mingshuang Qing, Haitao Zheng, Huibin Zhang, Xianyu Wu

doi:10.1142/s0218194024500293

Abstract

Static code analysis tools (SATs) are widely used to detect potential defects in software projects. However, the usability of SATs is seriously hindered by the large number of unactionable warnings they report. Many warning prioritization approaches have been proposed to improve the usability of SATs. These approaches mainly extract warning features that capture statistical or historical information about warnings, and use them to rank actionable warnings ahead of unactionable ones. Extracting such features relies heavily on domain knowledge. However, precise domain knowledge is difficult to acquire, and domain knowledge obtained in one project cannot be directly applied to other projects because application scenarios differ among projects. To address these problems, we propose a pattern mining-based warning prioritization approach based on the warning-related Abstract Syntax Tree (AST). To automatically mine actionable warning patterns, our approach leverages an advanced technique to collect actionable warnings, designs an algorithm to extract the warning-related AST, and mines patterns from the ASTs of all actionable warnings. To prioritize newly reported warnings, our approach combines exact and fuzzy matching techniques to calculate the similarity score between the patterns of newly reported warnings and the mined actionable warning patterns. We compare our approach with four typical baselines on five large-scale open-source Java projects. The results show that our approach outperforms the four baselines and achieves the maximum MAP (0.76) and MRR (2.19). In addition, a case study on the Defects4J dataset demonstrates that our approach can discover 83% of true defects in the top 10 warnings.
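
The prioritization step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how patterns are represented or how the two matching modes are combined, so the sketch assumes that a pattern is a sequence of AST node-type names, that an exact match scores 1.0, and that Python's difflib.SequenceMatcher stands in for the fuzzy matcher. The function names and the sample patterns are hypothetical.

```python
from difflib import SequenceMatcher


def pattern_similarity(candidate, mined):
    """Score one candidate pattern against one mined pattern.

    Both arguments are sequences of AST node-type names (an assumed
    representation). An exact match scores 1.0; otherwise the score is
    the SequenceMatcher ratio, used here as a stand-in fuzzy measure.
    """
    candidate, mined = list(candidate), list(mined)
    if candidate == mined:
        return 1.0
    return SequenceMatcher(None, candidate, mined).ratio()


def warning_score(candidate, mined_patterns):
    """Best similarity of a warning's pattern to any mined pattern."""
    return max(
        (pattern_similarity(candidate, m) for m in mined_patterns),
        default=0.0,
    )


def prioritize(warnings, mined_patterns):
    """Return warning ids ordered from most to least likely actionable.

    `warnings` maps a warning id to its pattern.
    """
    return sorted(
        warnings,
        key=lambda wid: warning_score(warnings[wid], mined_patterns),
        reverse=True,
    )


# Hypothetical data: one mined actionable pattern, three new warnings.
mined_patterns = [("IfStatement", "InfixExpression", "NullLiteral")]
new_warnings = {
    "w1": ("ForStatement", "Assignment"),
    "w2": ("IfStatement", "InfixExpression", "NullLiteral"),
    "w3": ("IfStatement", "InfixExpression", "ReturnStatement"),
}
ranking = prioritize(new_warnings, mined_patterns)
```

In this sketch the warning whose pattern matches a mined pattern exactly is ranked first, a partially matching warning follows, and a warning with no overlap is ranked last.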

Full Text