Existing software clustering techniques tend to ignore prior knowledge from domain experts, leading to results (suggested big-bang remodularization actions) that developers find unacceptable. Incorporating domain experts’ knowledge or constraints during clustering ensures that the obtained modularization aligns with developers’ perspectives, enhancing software quality. However, having knowledgeable domain experts manually review the software to generate constraints is time-consuming and labor-intensive. In this article, we propose an evolution-aware constraint derivation approach, Escort, which automatically derives clustering constraints from the evolutionary history of the analyzed software. Specifically, Escort can serve as an alternative way to derive implicit and explicit constraints in situations where domain experts are absent. In the subsequent constrained clustering process, Escort can be regarded as a framework that supplements and enhances various unconstrained clustering techniques to improve their accuracy and reliability. We evaluate Escort through both quantitative and qualitative analysis. In the quantitative validation, Escort, using the generated clustering constraints, outperforms seven classic unconstrained clustering techniques. Qualitatively, a survey of developers from five IT companies indicates that 89% agree with Escort’s clustering constraints. We also evaluate the utility of the refactoring suggestions produced by our constrained clustering approach: 54% were acknowledged by project developers, having been either implemented or planned for future releases.