Abstract
Component-based software development (CBSD) is one of the most recent trends in the software development industry, and its success depends largely on the quality of the software components. Good-quality software components are internally strongly cohesive and independent of other components. Using such components speeds up software development and reduces future maintenance effort. Such components can be identified from different software repositories and reused whenever needed. Hence, proper identification of such reusable components is a promising area of research, and it is the target of this paper. This paper models the Reusable Software Component Identification (RSCI) problem as a search-based multi-objective problem in order to identify optimized reusable software components from the object-oriented (OO) source code of a software system. For the OO software paradigm, we consider a component to be an individual class or a group of connected classes that can be reused with the least modification. To identify such components, three types of relationship are proposed in this paper: (1) Frequent Usage Pattern (FUP) based cohesion, (2) semantic relatedness based cohesion, and (3) co-change based coupling. The proposed approach simultaneously maximizes both types of cohesion and minimizes the coupling of a given component. The FUP-based cohesion maximization relies on a newly proposed cohesion metric called PatternCohesion, which is computed using FUP information extracted from a given software element (a class or interface in an object-oriented language). The FUP of a class comprises those member variables of all classes of the software that are directly or indirectly accessed by member functions defined in the class under consideration. The PatternCohesion metric measures the functional relatedness among different pairs of software elements (classes).
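The FUP definition above can be illustrated with a minimal sketch. It assumes the source code has already been parsed into three hypothetical lookup tables (`methods_of`, `direct_access`, `calls`), none of which are named in the paper, and it treats "indirectly accessed" as "accessed through a chain of method calls". The paper's actual extraction procedure may differ, and the PatternCohesion metric itself is not reproduced here because the abstract does not define it.

```python
def frequent_usage_pattern(cls, methods_of, direct_access, calls):
    """Collect the member variables reached, directly or through calls,
    by the methods defined in `cls` (illustrative reading of the FUP)."""
    seen = set()
    variables = set()
    stack = list(methods_of.get(cls, []))
    while stack:
        method = stack.pop()
        if method in seen:
            continue  # guard against recursive or cyclic calls
        seen.add(method)
        variables |= direct_access.get(method, set())
        stack.extend(calls.get(method, ()))
    return variables


# Hypothetical two-class system, invented for illustration only.
methods_of = {"Order": ["Order.total"], "Item": ["Item.price"]}
direct_access = {
    "Order.total": {"Order.items"},
    "Item.price": {"Item.unitPrice", "Item.qty"},
}
calls = {"Order.total": ["Item.price"]}

fup_order = frequent_usage_pattern("Order", methods_of, direct_access, calls)
fup_item = frequent_usage_pattern("Item", methods_of, direct_access, calls)
print(sorted(fup_order))
print(sorted(fup_item))
```

In this toy input, `Order` reaches the variables of `Item` through a call, so the two usage patterns overlap. That kind of overlap is the signal a pairwise cohesion metric could build on.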
The semantic relatedness based cohesion maximization relies on TF-IDF based cosine similarity, computed over tokens extracted from three main parts of the source code of a given software element: the class or interface declaration statement, the data types of its member variables, and its member function signatures. The co-change based coupling is computed from the change history of the underlying software system using the well-known data mining metric Support. Finally, reusable components are identified as different cohesive groups (each consisting of one or more connected classes) using the NSGA-III algorithm. The proposed approach is empirically evaluated on six open-source software systems belonging to different domains. The need to consider all three types of relations is established by comparing their combined performance with that of the relations taken individually and two at a time. The obtained results indicate higher quality of the identified software components, measured in terms of reusability characteristics.
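The two pairwise measures named above can be sketched as follows. This is a minimal illustration, not the paper's implementation. It assumes Support is the standard association-rule definition (the fraction of commits in which both classes change together) and uses a smoothed IDF variant chosen here for convenience. The function names, token lists, and commit sets are all hypothetical.

```python
import math
from collections import Counter


def co_change_support(commits, a, b):
    """Fraction of commits in which classes a and b changed together."""
    if not commits:
        return 0.0
    together = sum(1 for changed in commits if a in changed and b in changed)
    return together / len(commits)


def tfidf_vectors(docs):
    """Map each class name to a sparse TF-IDF vector (token -> weight)."""
    n = len(docs)
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))  # document frequency counts each token once
    vectors = {}
    for name, tokens in docs.items():
        tf = Counter(tokens)
        vectors[name] = {
            # smoothed IDF (an assumption; the paper's weighting may differ)
            t: (c / len(tokens)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical change history: each set lists the classes touched by a commit.
commits = [
    {"Order", "Invoice"},
    {"Order", "Invoice", "Customer"},
    {"Customer"},
    {"Order"},
]
# Hypothetical tokens taken from declarations, field types, and signatures.
docs = {
    "Order": ["order", "item", "add", "item", "total"],
    "Invoice": ["invoice", "order", "total", "print"],
    "Logger": ["logger", "write", "message"],
}
vecs = tfidf_vectors(docs)
print(co_change_support(commits, "Order", "Invoice"))
print(cosine(vecs["Order"], vecs["Invoice"]))
```

A multi-objective search such as NSGA-III would then score each candidate grouping of classes by aggregating values like these: both cohesion scores are maximized and the co-change coupling is minimized.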