For example, the telegraphic abstract prepared for section 2-313 of the Uniform Commercial Code17 can be translated loosely from its machineable form into English as follows:

Goods were processed by selling from a seller to a buyer; an express warranty was defined or created in that a bargain occurred from a buyer to a seller conditional upon an affirmation, a promise, a description, and a sample.

In this rough paraphrase, the italicized words are those taken from the original text, and the connectives are an approximation of the meaning of the role indicator codes that were used to tag the underlined words. The analysis described up to this point is performed by individuals skilled in the literature of the field. A further analysis of the underlined words takes place via machine: the English words are looked up in the WRU Semantic Code Dictionary, and a set of codes representing various generic aspects of the original English word is substituted for the original, so that the finally coded form consists of role indicators (three-letter codes), punctuation (commas, periods, and the like), and semantic codes (four-letter codes). The telegraphic abstracts for all of the documents in a given file are finally maintained in coded form serially on magnetic tape, in readiness for computer search. At search time, a question is prepared in a form that closely resembles a coded telegraphic abstract, and the file is searched for references that match (in the logical sense) the question.

Some comment should be made about the facilities delivered by this unique indexing method. Although the analysis is complex by comparison with other methods, the human effort required is minimized by the restrictions imposed in the format and structure. (Only a small set of role indicators is actually used, and there is a finite number of ways in which they combine naturally; this makes it easy for the indexer to make the necessary decisions.)
The format also increases the resolution of searches; that is, it decreases the possibility of false drops. On the other hand, the generic coding of the key words selected from the source document increases the system's ability to find references characterized by synonyms or terms of broader or narrower significance, and this facility of course increases the probability that all relevant documents will be found. In summary, the WRU coding system embodies a considerable refinement in analysis over most indexing systems, and its indexing depth probably is greater than that of most conventional systems. These would be desirable attributes to incorporate into an automatic indexing system.

17 Melton & Bensing, supra note 16, at 240-43.

LAW AND CONTEMPORARY PROBLEMS

E. Searching of Conventionally Indexed Documents, with Search Augmented by Statistically Associated Index Terms

John C. Lyons, of the Graduate School of Public Law, George Washington University, with the participation of the Datatrol Corporation, has adapted Dr. Edmund Stiles' association factor18 to the searching of files of documents dealing with antitrust problems.19 The technique consists essentially of the following steps:

1. Preparation of the File to Be Searched

a) Choose (via human analysis) index terms to represent each document in the file.

b) Prepare a Term Profile tape using the following procedure: Compute association factors relating each unique index term in the entire file with each of the other terms with which it ever co-occurs in a document, using Stiles' modified chi-square formula,

$$\log_{10} \frac{\left(|fN - AB| - \frac{N}{2}\right)^{2} N}{AB(N-A)(N-B)} = \text{ASSOCIATION FACTOR}$$

where A is the number of documents indexed by one term, B is the number of documents indexed by a second term, f is the number of documents indexed by both terms, and N is the total number of documents in the collection.
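The computation of association factors for a file of indexed documents can be sketched as follows. This is an illustration only, not a reconstruction of the Lyons or Datatrol program. It assumes the formula takes the form log10 of (|fN - AB| - N/2)^2 N / (AB(N-A)(N-B)), with A, B, f, and N as defined above. The function names, the choice to skip pairs for which the logarithm is undefined, and the six-document sample file are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def association_factor(a, b, f, n):
    """Assumed form of the modified chi-square factor for one pair of terms.

    a, b: number of documents indexed by each term
    f:    number of documents indexed by both terms
    n:    total number of documents in the collection
    Returns None where the logarithm is undefined (an assumption of
    this sketch, not something the source specifies).
    """
    numerator = (abs(f * n - a * b) - n / 2) ** 2 * n
    denominator = a * b * (n - a) * (n - b)
    if numerator == 0 or denominator == 0:
        return None
    return math.log10(numerator / denominator)


def association_factors(docs):
    """Compute a factor for every pair of terms that co-occur in a document."""
    n = len(docs)
    # Number of documents indexed by each term (A and B in the formula).
    term_counts = Counter(term for doc in docs for term in set(doc))
    # Number of documents indexed by both terms of a pair (f in the formula).
    pair_counts = Counter(
        pair for doc in docs for pair in combinations(sorted(set(doc)), 2)
    )
    factors = {}
    for (t1, t2), f in pair_counts.items():
        value = association_factor(term_counts[t1], term_counts[t2], f, n)
        if value is not None:
            factors[(t1, t2)] = value
    return factors


# Hypothetical sample file: each document is represented by its index terms.
docs = [
    {"merger", "clayton act", "competition"},
    {"merger", "clayton act"},
    {"price fixing", "sherman act"},
    {"price fixing", "sherman act", "competition"},
    {"merger", "clayton act", "stock acquisition"},
    {"sherman act", "monopoly"},
]

factors = association_factors(docs)
for pair, value in sorted(factors.items(), key=lambda item: -item[1]):
    print(pair, round(value, 3))
```

Pairs are keyed in alphabetical order so that each pair of terms appears once. Terms that never co-occur in a document receive no factor at all, which matches the restriction in the procedure to terms that "ever co-occur."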
Then, for each index term, prepare a list (the term profile) of the other terms with which the given term exhibits significant association. For example, Lyons reports20 that the Clayton Act Section 11 was found in this manner to be associated with the following