Software systems’ concrete architecture often drifts from the intended architecture throughout their evolution. Program comprehension activities, such as software architecture recovery, become very demanding, especially for large and complex systems, because of noise: omnipresent and utility classes that obscure the system structure. Omnipresent classes represent crosscutting concerns, utilities, or elementary domain concepts. Identifying and filtering noise is a necessary preprocessing step before applying program comprehension techniques, especially for undocumented systems. In this paper, we propose an automated methodology for noise identification. Our methodology is based on the notion that noisy classes are widely used in a system, directly or indirectly. We combine classes’ usage significance with their participation in the system’s subgraphs in order to identify the classes that are persistently used. Usage significance is measured with Component Rank, a well-established metric in the literature for ranking software artifacts. The experimental results show that the proposed methodology successfully captures classes that produce noise and improves the results of existing algorithms for software systems’ architectural decomposition.
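
To make the notion of usage significance concrete, the following is a minimal sketch of a PageRank-style ranking over a class usage graph, which is the general idea behind Component Rank. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the function name `usage_rank`, the damping factor of 0.85, the uniform edge weights, and the toy class names are all hypothetical. The subgraph-participation step of the methodology is not covered here.

```python
from typing import Dict, List


def usage_rank(
    uses: Dict[str, List[str]],
    damping: float = 0.85,   # assumed value, not taken from the paper
    iterations: int = 100,
) -> Dict[str, float]:
    """Rank classes by how much usage weight flows into them.

    `uses` maps each class to the classes it uses. Weight propagates
    along use relations, so indirect use also raises a class's rank.
    """
    nodes = sorted(set(uses) | {t for targets in uses.values() for t in targets})
    if not nodes:
        return {}
    n = len(nodes)
    rank = {c: 1.0 / n for c in nodes}

    for _ in range(iterations):
        # Classes that use nothing spread their weight evenly over all classes.
        dangling = sum(rank[c] for c in nodes if not uses.get(c))
        new_rank = {c: (1.0 - damping) / n + damping * dangling / n for c in nodes}
        for src, targets in uses.items():
            if targets:
                # Each class splits its weight equally among the classes it uses.
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical toy system: Logger is used by every other class.
example_uses = {
    "OrderService": ["Logger", "Order"],
    "BillingService": ["Logger", "Order"],
    "Order": ["Logger"],
    "Logger": [],
}
ranks = usage_rank(example_uses)
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{name}: {ranks[name]:.3f}")
```

In this toy graph the utility class `Logger` receives the largest share of weight, which matches the intuition that widely used classes are candidates for noise. A high rank alone would not separate noise from central domain classes, which is why the methodology combines the ranking with subgraph participation.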