On pattern-directed search of archives and collections

Garett O Dworman, Chuck Patch, Steven O Kimbrough

doi:10.1002/(sici)1097-4571(2000)51:1<14::aid-asi4>3.3.co;2-e


Open Access



Abstract

This article begins by presenting and discussing the distinction between record-oriented and pattern-oriented search. Examples of record-oriented (or item-oriented) questions include: “What (or how many, etc.) glass items made prior to 100 A.D. do we have in our collection?” and “How many paintings featuring dogs do we have that were painted during the 19th century, and who painted them?” Standard database systems are well suited to answering such questions, based on the data in, for example, a collections management system. Examples of pattern-oriented questions include: “How does the (apparent) production of glass objects vary over time between 400 B.C. and 100 A.D.?” and “What other animals are present in paintings with dogs (painted during the 19th century and in our collection)?” Standard database systems are not well suited to answering these sorts of questions (and pattern-oriented questions in general), even though the basic data is properly stored in them. To answer pattern-oriented questions it is the accepted solution to transform the underlying (relational) data to what is called the data cube or cross tabulation form (there are other forms as well). We discuss how this can be done for non-numeric data, such as are found widely in museum collections and archives. Further we discuss and demonstrate two distinct, but related, approaches to exploring for patterns in such cross tabulated museum data. The two approaches have been implemented as the prototype systems Homer and MOTC. We conclude by discussing initial experimental evidence indicating that these approaches are indeed effective in helping people find answers to their pattern-oriented questions of museum and archive collections.

Two Kinds of Questions

One’s purpose, when approaching an archive or museum collection for information, might be characterized as seeking an answer to one or more questions. Thus, if an information system is to be helpful in answering one’s questions of archives and collections, it would seem that categorizing the questions to be asked can only be helpful in designing an information system to assist in answering them. What kinds of questions are there that are pertinent to archives and museum collections? This is a large and difficult issue, and we do not expect to resolve it here. Our aim in this article is more modest: we wish to distinguish two kinds of questions, and to explore their relevance to museum and archive informatics. We devote the remainder of the present section to making and exploring our basic distinction. The sections that follow explore the distinction in the context of a particular information system, the Core of Discovery system, installed at The Historic New Orleans Collection.

The distinction we wish to make here, and to exploit in designing museum and archive information systems, is deeply embedded in folklore and ordinary language. “You cannot see the wood for the trees” is perhaps the earliest recorded embodiment of the distinction in English. [The quotation is from John Heywood’s Proverbs, itself the earliest published (1546) collection of English folk sayings.] Proverbially, there is a distinction to be made between seeing (or asking about) the trees and seeing (or asking about) the forest.

But how can we characterize the distinction and what can we do to provide computerized support for these two kinds of questions? One question at a time. First a characterization of the distinction. The distinction is best seen through a series of examples. Let us compare some tree questions with some forest questions. Here are some questions about trees in a forest.
