The source code complexity of legacy object-oriented (OO) software has a trickle-down effect on the key activities of software development and maintenance. Package-based OO design is widely believed to be an effective form of modularization. Recently, theories and methodologies have been proposed to assess the complementary aspects of legacy OO systems through package-modularization metrics. These metrics address non-API-based object-oriented principles such as encapsulation, commonality-of-goal, changeability, maintainability, and analyzability. Despite their ability to characterize package organization, their usefulness for cost-effective fault-proneness prediction has yet to be determined. In this paper, we present a theoretical illustration and an empirical perspective of non-API-based package-modularization metrics for effort-aware fault-proneness prediction. First, we employ correlation analysis to evaluate the relationship between faults and package-level metrics. Second, we use multivariate logistic regression with effort-aware performance indicators (ranking and classification) to investigate the practical application of the proposed metrics. Our experimental analysis of open-source Java software systems provides statistical evidence of their value for fault-proneness prediction and shows relatively better explanatory power than traditional metrics. These results can therefore guide developers toward reliable and modular package-based software design.