Abstract
Maintaining high code quality is a crucial concern in software development. Existing studies have demonstrated that developers frequently face recurrent bugs and adopt similar fixes, known as code change patterns. As an essential static analysis technique, code pattern mining supports various tasks, including code refactoring, automated program repair, and defect prediction, thus significantly improving software development processes. A prevalent approach to identifying code patterns translates code changes into edit actions and represents them with a Bag-of-Words (BoW) model. However, when applied to open-source projects, this method exhibits several limitations. For instance, it overlooks function call information and disregards the order of feature words. This study introduces MIFA, a novel technique for mining code change patterns using multiple feature analysis. MIFA extends existing BoW methods by incorporating analysis of function calls and of overall changes in the Abstract Syntax Tree (AST) structure. We selected 20 popular Python projects and evaluated MIFA in both intra-project and cross-project scenarios. The experimental results indicate that: (1) MIFA achieved higher silhouette coefficients and F1 scores than other state-of-the-art methods, demonstrating superior accuracy; (2) MIFA can help developers detect unique change patterns earlier, with an efficiency improvement of over 40% compared to random sampling. Additionally, we discuss critical parameters for measuring the similarity of code changes, guiding users to apply our method effectively.
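To make the idea of combining BoW features with function call information concrete, the following is a minimal sketch and not MIFA's actual implementation, whose feature extraction the abstract does not specify. It assumes a change is given as a before/after pair of Python snippets, uses the standard `ast` module, and describes the change by the AST node types and called names that were added or removed. The function names (`node_type_bag`, `called_names`, `change_features`, `cosine`) and the `+node:`/`-call:` feature labels are hypothetical, chosen only for illustration.

```python
import ast
import math
from collections import Counter


def node_type_bag(source: str) -> Counter:
    """Count AST node types in a snippet (an order-insensitive bag of words)."""
    return Counter(type(node).__name__ for node in ast.walk(ast.parse(source)))


def called_names(source: str) -> Counter:
    """Count the names of functions and methods called in a snippet."""
    names = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):
                names[func.id] += 1
            elif isinstance(func, ast.Attribute):
                names[func.attr] += 1
    return names


def change_features(before: str, after: str) -> Counter:
    """Describe a change by the node types and calls it adds or removes."""
    features = Counter()
    for label, extract in (("node", node_type_bag), ("call", called_names)):
        old, new = extract(before), extract(after)
        for key, count in (new - old).items():
            features[f"+{label}:{key}"] = count
        for key, count in (old - new).items():
            features[f"-{label}:{key}"] = count
    return features


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse feature vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Two changes that replace print() with logger.info(), and one unrelated change.
change_a = change_features("print(msg)\n", "logger.info(msg)\n")
change_b = change_features("print(err)\n", "logger.info(err)\n")
change_c = change_features("x = y + 1\n", "x = y - 1\n")

similar = cosine(change_a, change_b)    # same edit applied to different variables
unrelated = cosine(change_a, change_c)  # no shared added/removed features
```

In this sketch the two logging changes share all of their features and the arithmetic change shares none, so a clustering step built on such similarities would tend to group the first two together. Because the representation is an unordered bag, it has the word-order limitation the abstract attributes to plain BoW methods.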