This paper summarizes some recent experience in analyzing and eliminating sources of error in the design phase of large software projects. It begins by pointing out some of the significant differences in software error incidence between large and small software projects. The most striking contrast, illustrated by project data, is the large preponderance of design errors over coding errors on large-scale projects, not only with respect to numbers of errors, but also with respect to the relative time and effort required to detect them and correct them. The paper next presents a taxonomy of software error causes, and some analyses of the design error data, performed to obtain a better understanding of the nature of large-scale software design errors and to evaluate alternative methods of preventing, detecting, and eliminating them. Based on this analysis of observational data, a hypothesis was derived regarding the potential cost-effectiveness of an automated aid to detecting inconsistencies between assertions about the nature of inputs and outputs of the various elements (functions, modules, data bases, data sources, etc.) of the software design. This hypothesis was tested by developing a prototype version of such an aid, the Design Assertion Consistency Checker (DACC), using TRW's Generalized Information Management (GIM) system, and using it on a large-scale software project with 186 elements and 967 assertions about their inputs and outputs. Of the 121,000 possible mismatches between input and output assertions, DACC found 818, at a cost in computer time of $30. Most of the mismatches resulted from shortfalls in the initial version of DACC or the initial data preparation, such as the lack of a synonym capability and the lack of explicit statements about external inputs and outputs.
However, a number of serious mismatches were exposed at a time when they were easy to correct, and a most useful work-list was generated of items needing resolution before allowing the design effort to proceed to further detail. In general, the data confirmed the hypothesis about the general utility of a DACC capability for large software projects. However, a number of additional features should be considered to compensate for current deficiencies (in areas such as manuscript preparation) and to take full advantage of having the software design in machine-readable form.
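As an illustration only, the kind of consistency check described above might be sketched as follows. This is not the DACC implementation, which was built on GIM; the function name `find_mismatches`, the `(element, direction, item)` record layout, the element and item names, and the `synonyms` and `external` parameters are all hypothetical. The sketch assumes a mismatch means an asserted input that no element outputs, or an asserted output that no element takes as input.

```python
from collections import defaultdict

def find_mismatches(assertions, synonyms=None, external=frozenset()):
    """Report items asserted as inputs with no producer, or outputs with no consumer.

    assertions: iterable of (element, direction, item), direction "in" or "out".
    synonyms:   optional map from an alternate item name to its canonical name.
    external:   item names declared to cross the system boundary (never reported).
    """
    synonyms = synonyms or {}

    def canon(name):
        # Resolve an item name to its canonical form (identity if no synonym).
        return synonyms.get(name, name)

    external = {canon(name) for name in external}
    producers = defaultdict(set)  # item -> elements that assert it as an output
    consumers = defaultdict(set)  # item -> elements that assert it as an input

    for element, direction, item in assertions:
        if direction == "out":
            producers[canon(item)].add(element)
        elif direction == "in":
            consumers[canon(item)].add(element)
        else:
            raise ValueError(f"unknown direction: {direction!r}")

    mismatches = []
    for item in sorted(set(producers) | set(consumers)):
        if item in external:
            continue
        if item not in producers:
            mismatches.append((item, "input with no producer", sorted(consumers[item])))
        elif item not in consumers:
            mismatches.append((item, "output with no consumer", sorted(producers[item])))
    return mismatches

# Hypothetical design fragment: two elements, three assertions.
assertions = [
    ("TRACK_FILTER", "in", "radar_return"),
    ("TRACK_FILTER", "out", "track_state"),
    ("DISPLAY", "in", "track_status"),
]

# With no synonym table and no declared externals, every item is flagged.
raw = find_mismatches(assertions)

# Declaring a synonym and an external input removes the spurious reports.
cleaned = find_mismatches(
    assertions,
    synonyms={"track_status": "track_state"},
    external={"radar_return"},
)
```

In this toy case all three reports in `raw` disappear once the synonym and the external input are declared, which mirrors the observation that most of the reported mismatches traced back to the missing synonym capability and the missing statements about external inputs and outputs.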