Abstract
The goal of the Partial Metrics Project is the automatic acquisition of planning knowledge from target code modules in a program library. In the current prototype the system is given a target code module written in Ada as input, and the result is a sequence of generalized transformations that can be used to design a class of related modules. This is accomplished by embedding techniques from Artificial Intelligence into the traditional structure of a compiler. The compiler performs compilation in reverse, starting with detailed code and producing an abstract description of it. The principal task facing the compiler is to find a decomposition of the target code into a collection of syntactic components that are nearly decomposable. Here, nearly decomposable means that each code segment must be nearly syntactically independent of the others. The most independent segments then become the target of the code generalization process. This process can be described as a form of chunking and is implemented here in terms of explanation-based learning. The problem of producing nearly decomposable code components becomes difficult when the target code module is not well structured. The task facing users of the system is to identify, from a library of modules, those well-structured code modules that are suitable as input to the system. In this paper we describe the use of inductive learning techniques, namely variations on Quinlan's ID3 system, to produce a decision tree that can be used to conceptually distinguish between well-structured and poorly structured code. To accomplish that task, a set of high-level concepts used by software engineers to characterize structurally understandable code was identified. Next, each of these concepts was operationalized in terms of code complexity metrics that can be easily calculated during the compilation process. These metrics relate to various aspects of program structure, including coupling, cohesion, data structure, control structure, and documentation. Each candidate module was then described in terms of a collection of such metrics. Using a training set of positive and negative examples of well-structured modules, each described in terms of the selected metrics, a decision tree was produced and used to recognize other well-structured modules in terms of their metric properties. This approach was applied to modules from existing software libraries in a variety of domains, such as database, editor, graphics, windowing, data processing, FFT, and computer vision software. The results achieved by the system were then benchmarked against the performance of experienced programmers in recognizing well-structured code. In a test case involving 120 modules, the system discriminated between poorly and well-structured code 99% of the time, compared with an 80% average for the 52 programmers sampled. The results suggest that such an inductive system can serve as a practical mechanism for effectively identifying reusable code modules in terms of their structural properties.
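For concreteness, the sketch below illustrates the kind of ID3-style induction the abstract describes: a decision tree is grown over modules represented as vectors of complexity metrics and then used to classify new modules as well or poorly structured. It is a minimal illustration, not the authors' implementation; the metric names, thresholds, and training examples are hypothetical, and numeric metrics are handled with binary threshold splits (a common C4.5-style adaptation of ID3).

# Illustrative sketch only: ID3-style decision-tree induction over module
# complexity metrics.  Metric names, values, and labels below are hypothetical.
import math
from collections import Counter

# Each training example: ({metric_name: value, ...}, label).
TRAINING_SET = [
    ({"cyclomatic": 4,  "coupling": 2, "cohesion": 0.8, "comment_ratio": 0.30}, "well-structured"),
    ({"cyclomatic": 25, "coupling": 9, "cohesion": 0.2, "comment_ratio": 0.05}, "poorly-structured"),
    ({"cyclomatic": 6,  "coupling": 3, "cohesion": 0.7, "comment_ratio": 0.25}, "well-structured"),
    ({"cyclomatic": 18, "coupling": 7, "cohesion": 0.3, "comment_ratio": 0.10}, "poorly-structured"),
]

def entropy(examples):
    # Shannon entropy of the class labels in a set of examples.
    counts = Counter(label for _, label in examples)
    total = len(examples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def best_split(examples):
    # Choose the (metric, threshold) pair with the highest information gain.
    base = entropy(examples)
    best = None
    for metric in examples[0][0]:
        values = sorted({m[metric] for m, _ in examples})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left  = [e for e in examples if e[0][metric] <= threshold]
            right = [e for e in examples if e[0][metric] >  threshold]
            remainder = (len(left) / len(examples)) * entropy(left) + \
                        (len(right) / len(examples)) * entropy(right)
            gain = base - remainder
            if best is None or gain > best[0]:
                best = (gain, metric, threshold, left, right)
    return best

def build_tree(examples):
    # Recursively grow the tree until nodes are pure or no split is informative.
    labels = {label for _, label in examples}
    if len(labels) == 1:
        return labels.pop()
    split = best_split(examples)
    if split is None or split[0] <= 0:
        return Counter(l for _, l in examples).most_common(1)[0][0]
    _, metric, threshold, left, right = split
    return (metric, threshold, build_tree(left), build_tree(right))

def classify(tree, metrics):
    # Walk from the root to a leaf, following the threshold tests.
    while isinstance(tree, tuple):
        metric, threshold, left, right = tree
        tree = left if metrics[metric] <= threshold else right
    return tree

tree = build_tree(TRAINING_SET)
print(classify(tree, {"cyclomatic": 5, "coupling": 2, "cohesion": 0.9, "comment_ratio": 0.4}))
# -> well-structured

In the approach described above, a tree induced in this way from labeled library modules would then be used to screen candidate modules by their metric profiles before they are passed to the reverse-compilation and generalization stages.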