Abstract
The goal of the Partial Metrics Project is the automatic acquisition of planning knowledge from target code modules in a program library. In the current prototype the system is given a target code module written in Ada as input, and the result is a sequence of generalized transformations that can be used to design a class of related modules. This is accomplished by embedding techniques from Artificial Intelligence into the traditional structure of a compiler. The compiler performs compilation in reverse, starting with detailed code and producing an abstract description of it. The principal task facing the compiler is to find a decomposition of the target code into a collection of syntactic components that are nearly decomposable. Here, nearly decomposable means that each code segment must be nearly syntactically independent of the others. The most independent segments then become the target of the code generalization process. This process can be described as a form of chunking and is implemented here in terms of explanation-based learning. The problem of producing nearly decomposable code components becomes difficult when the target code module is not well structured. The task facing users of the system is to identify, from a library of modules, those well-structured code modules that are suitable as input to the system. In this paper we describe the use of inductive learning techniques, namely variations on Quinlan's ID3 system, to produce a decision tree that can be used to conceptually distinguish between well-structured and poorly structured code. To accomplish that task, a set of high-level concepts used by software engineers to characterize structurally understandable code was identified. Next, each of these concepts was operationalized in terms of code complexity metrics that can be easily calculated during the compilation process. These metrics relate to various aspects of program structure, including coupling, cohesion, data structure, control structure, and documentation. Each candidate module was then described in terms of a collection of such metrics. Using a training set of positive and negative examples of well-structured modules, each described in terms of the selected metrics, a decision tree was produced and used to recognize other well-structured modules in terms of their metric properties. This approach was applied to modules from existing software libraries in a variety of domains, such as database, editor, graphics, windowing, data processing, FFT, and computer vision software. The results achieved by the system were then benchmarked against the performance of experienced programmers in recognizing well-structured code. In a test case involving 120 modules, the system discriminated between poorly and well-structured code 99% of the time, compared with an 80% average for the 52 programmers sampled. The results suggest that such an inductive system can serve as a practical mechanism for effectively identifying reusable code modules in terms of their structural properties.
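For concreteness, the sketch below illustrates the kind of ID3-style induction the abstract describes: a decision tree is grown over modules represented as vectors of complexity metrics and then used to classify new modules as well or poorly structured. It is a minimal illustration, not the authors' implementation; the metric names, thresholds, and training examples are hypothetical, and numeric metrics are handled with binary threshold splits (a common C4.5-style adaptation of ID3).

# Illustrative sketch only: ID3-style decision-tree induction over module
# complexity metrics.  Metric names, values, and labels below are hypothetical.
import math
from collections import Counter

# Each training example: ({metric_name: value, ...}, label).
TRAINING_SET = [
    ({"cyclomatic": 4,  "coupling": 2, "cohesion": 0.8, "comment_ratio": 0.30}, "well-structured"),
    ({"cyclomatic": 25, "coupling": 9, "cohesion": 0.2, "comment_ratio": 0.05}, "poorly-structured"),
    ({"cyclomatic": 6,  "coupling": 3, "cohesion": 0.7, "comment_ratio": 0.25}, "well-structured"),
    ({"cyclomatic": 18, "coupling": 7, "cohesion": 0.3, "comment_ratio": 0.10}, "poorly-structured"),
]

def entropy(examples):
    # Shannon entropy of the class labels in a set of examples.
    counts = Counter(label for _, label in examples)
    total = len(examples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def best_split(examples):
    # Choose the (metric, threshold) pair with the highest information gain.
    base = entropy(examples)
    best = None
    for metric in examples[0][0]:
        values = sorted({m[metric] for m, _ in examples})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left  = [e for e in examples if e[0][metric] <= threshold]
            right = [e for e in examples if e[0][metric] >  threshold]
            remainder = (len(left) / len(examples)) * entropy(left) + \
                        (len(right) / len(examples)) * entropy(right)
            gain = base - remainder
            if best is None or gain > best[0]:
                best = (gain, metric, threshold, left, right)
    return best

def build_tree(examples):
    # Recursively grow the tree until nodes are pure or no split is informative.
    labels = {label for _, label in examples}
    if len(labels) == 1:
        return labels.pop()
    split = best_split(examples)
    if split is None or split[0] <= 0:
        return Counter(l for _, l in examples).most_common(1)[0][0]
    _, metric, threshold, left, right = split
    return (metric, threshold, build_tree(left), build_tree(right))

def classify(tree, metrics):
    # Walk from the root to a leaf, following the threshold tests.
    while isinstance(tree, tuple):
        metric, threshold, left, right = tree
        tree = left if metrics[metric] <= threshold else right
    return tree

tree = build_tree(TRAINING_SET)
print(classify(tree, {"cyclomatic": 5, "coupling": 2, "cohesion": 0.9, "comment_ratio": 0.4}))
# -> well-structured

In the approach described above, a tree induced in this way from labeled library modules would then be used to screen candidate modules by their metric profiles before they are passed to the reverse-compilation and generalization stages.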