Integration through intermediary system networks

Richard S Marcus

doi:10.1145/320599.320605

Abstract

Usage of online bibliographic databases has been hampered by the twin problems of (1) retrieval system complexity and (2) the heterogeneity among the many different systems and their hundreds of databases. These problems have generally limited operation of these systems to human expert information specialists who act as intermediaries for the “end users” who need the information in the online bibliographic databases. Although convenient physical access to these disparate systems is now possible through telecommunications networks, there is still considerable difficulty with logical access in such areas as identifying suitable available systems and handling login protocols. The (partial) solutions to these problems, creating a single good system or standardizing among many systems, tend to be very costly and/or difficult to implement.

As another means of surmounting these problems, we have experimented with a series of increasingly sophisticated computer intermediary systems, under the generic name CONIT (for COnnector for Networked Information Transfer), which attempt to allow computer-inexperienced end users to search these databases themselves. CONIT talks to users in a common, easy-to-learn and easy-to-use language. It first helps the user identify databases appropriate for his problem. It then automatically connects to a system that has the given database and translates the user's request into an appropriate series of commands for that system. Responses from the retrieval system are translated back to a common format for the user. CONIT also assists the user in reformulating his search strategy in order to find more relevant documents, or fewer irrelevant documents. (Bibliographic searching generally differs from numerical data searching in its greater ambiguity and need for dynamic reformulation through interaction with the database.) CONIT interfaces users to the three main bibliographic retrieval systems: DIALOG, SDC ORBIT, and National Library of Medicine MEDLINE.

In a recent series of experiments, one version of CONIT enabled end users who had not previously used any of the three retrieval systems to achieve retrieval effectiveness, measured by the number of relevant documents retrieved, as great as that achieved by the same users working with human expert intermediaries on the same problems in the same retrieval systems. The online session time was longer for the end users on CONIT, but the total person time was approximately the same for the two modes.

Those CONIT techniques that seem to be most important in achieving these results include: (1) a simple, common command-argument request language with English-like features (pure menus are considered too restrictive in this application and natural language too confusing for the user); (2) extensive CAI for teaching the command language and for assisting users in search strategy reformulation; (3) automation of much of the “mechanics” usually required for searching: e.g., selection of and translation for the different retrieval systems (a “virtual system” concept is achieved); handling physical connection and login protocols; and remembering, clearing and regenerating retrieved sets as needed; (4) selection of only the basic, core retrieval functions for teaching to new users; and (5) a search strategy based on automatic extraction of keyword stems from a user-given natural-language topic phrase.

Current and future research centers on making the computer intermediary a more truly “expert” assistant, based on a developing quantitative model of indexing and retrieval for text-based information, which would: (1) incorporate a dialog-mediated mode into the command language mode; (2) assist users to develop a conceptual formalization of their problems which will aid both the computer and the user in search strategy formulation and reformulation; and (3) estimate the number of relevant documents found and missed so far and suggest specific search strategy reformulations, including estimates of incremental costs and benefits.

Additional research questions include: (1) the proper emphasis and contexts for computer versus human directed control in mixed-initiative intermediary modes; (2) the appropriate distribution of intermediary functions among mainframe, mini, and micro hardware at retrieval system, network, and terminal sites (currently CONIT resides in a mainframe computer: MIT Multics); (3) the appropriate software structure for increasingly complex intermediaries (currently for CONIT we use a hybrid scheme where our own production rule interpreter is augmented by calls on PL/I compiler code); and (4) the appropriate structure for expert assistants as application areas are extended; e.g., a single integrated expert or a multiplicity of coordinated experts.
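
As a rough illustration of two techniques named in the abstract, keyword-stem extraction from a natural-language topic phrase and translation of a common request into system-specific commands, the following Python sketch may help. It is not CONIT's implementation: the stop-word list, the suffix-stripping rule, the system names `system_a` and `system_b`, and the command templates are all hypothetical placeholders chosen for illustration, and they do not reproduce the actual command syntax of DIALOG, ORBIT, or MEDLINE.

```python
import re

# Hypothetical stop words; the abstract does not say which words CONIT ignores.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "with"}

# Crude suffix list, assumed for illustration only (longest suffixes first).
SUFFIXES = ("ations", "ation", "ings", "ing", "ers", "er", "es", "s")


def stem(word, min_len=4):
    """Strip the first matching suffix, keeping at least min_len characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            return word[: -len(suffix)]
    return word


def keyword_stems(topic_phrase):
    """Return unique keyword stems from a topic phrase, in order of appearance."""
    stems = []
    for word in re.findall(r"[a-z]+", topic_phrase.lower()):
        if word in STOP_WORDS:
            continue
        s = stem(word)
        if s not in stems:
            stems.append(s)
    return stems


# Hypothetical per-system templates standing in for a "virtual system" layer.
# Real retrieval systems each have their own search and truncation syntax.
COMMAND_TEMPLATES = {
    "system_a": "SELECT {stem}?",
    "system_b": "FIND {stem}:",
}


def translate(stems, system):
    """Turn a list of stems into one search command per stem for a system."""
    template = COMMAND_TEMPLATES[system]
    return [template.format(stem=s) for s in stems]


if __name__ == "__main__":
    stems = keyword_stems("Transplantation of the kidneys")
    for system in COMMAND_TEMPLATES:
        print(system, translate(stems, system))
```

The point of the sketch is the separation of concerns: the user supplies one topic phrase, and only the template table changes from one retrieval system to the next.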

Full Text