An important problem in software development is to make better use of software libraries by improving the search and retrieval process, that is, by making it easier to find the few components you may want among the many you do not want. This paper suggests some ideas to improve this process: (1) Associate an algebraic specification with each software component; these specifications should include complete syntactic information, but need have only partial semantic information. (2) User queries consist of syntactic declarations plus results for sample executions. (3) User queries may be posed in standard programming notation, which is then automatically translated into algebraic notation. (4) Search is organized as ranked multi-level filtering, where each level yields a ranked set of partial matches. (5) Early stages of filtering narrow the search space by using computationally simple procedures, such as checking that the number of types is adequate. (6) Middle levels may find partial signature matches. (7) Pre-computed catalogues (i.e., indexes) can speed up early and middle level filtering. (8) Semantic information is used in a final filter with term rewriting, but complete verification is not attempted. (9) The series of filters is implemented incrementally, so as to backtrack to lower ranked components in case of failure. This approach avoids the need for complex theorem proving, and does not require any knowledge of algebraic specification from the user. Moreover, it does not require either specifications or queries to be complete or even fully correct, because it yields partial matches ranked by how well they fit the query. The paper concludes with a description of some preliminary experiments and some suggestions for further experiments.
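The ranked multi-level filtering outlined in points (4) to (9) can be illustrated with a minimal sketch. It is not the paper's implementation: all names (`Component`, `Query`, `type_count_filter`, `signature_score`, `sample_score`, `search`) and the scoring formulas are hypothetical, and the final filter simply calls a Python function on the sample inputs where the paper uses term rewriting over algebraic specifications.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class Component:
    name: str
    arg_types: tuple[str, ...]   # syntactic information (complete)
    result_type: str
    impl: Callable               # assumption: stands in for executable semantics


@dataclass(frozen=True)
class Query:
    arg_types: tuple[str, ...]
    result_type: str
    samples: tuple               # pairs of (arguments, expected result)


def type_count_filter(query: Query, library: Iterable[Component]) -> Iterator[Component]:
    """Early, cheap filter: keep components that mention enough distinct types."""
    needed = len(set(query.arg_types) | {query.result_type})
    for c in library:
        if len(set(c.arg_types) | {c.result_type}) >= needed:
            yield c


def signature_score(query: Query, c: Component) -> float:
    """Middle filter: partial signature match as a score in [0, 1]."""
    common = sum((Counter(query.arg_types) & Counter(c.arg_types)).values())
    result_hit = 1 if query.result_type == c.result_type else 0
    total = max(len(query.arg_types), len(c.arg_types)) + 1
    return (common + result_hit) / total


def sample_score(query: Query, c: Component) -> float:
    """Final filter: fraction of sample executions reproduced (no verification)."""
    if not query.samples:
        return 0.0
    passed = 0
    for args, expected in query.samples:
        try:
            if c.impl(*args) == expected:
                passed += 1
        except Exception:
            pass  # a failing sample lowers the score instead of aborting the search
    return passed / len(query.samples)


def search(query: Query, library: Iterable[Component], min_signature: float = 0.5):
    """Lazily yield (name, signature score, sample score), best signatures first."""
    survivors = type_count_filter(query, library)
    ranked = sorted(((signature_score(query, c), c) for c in survivors),
                    key=lambda pair: pair[0], reverse=True)
    for sig, c in ranked:
        if sig < min_signature:
            break  # remaining candidates rank even lower
        sem = sample_score(query, c)
        if sem > 0:
            yield c.name, sig, sem


library = [
    Component("concat", ("str", "str"), "str", lambda a, b: a + b),
    Component("negate", ("int",), "int", lambda a: -a),
    Component("add", ("int", "int"), "int", lambda a, b: a + b),
    Component("mul", ("int", "int"), "int", lambda a, b: a * b),
]
query = Query(("int", "int"), "int", (((2, 3), 5), ((2, 2), 4)))
results = list(search(query, library))
```

Because `search` is a generator, a caller can take the best candidate with `next(...)` and ask for the next lower-ranked one only if the first turns out to be unsuitable, which loosely mirrors the incremental backtracking of point (9). With the toy library above, `add` reproduces both samples, while `mul` is retained as a partial match because it reproduces only one.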