This session focuses on important technical issues in implementing digital libraries with multiple collections, different languages, and diverse audiences. A panel of experts with experience in both researching and implementing such digital libraries will discuss the following key issues: (1) metadata strategies, (2) interoperability among the various knowledge organization systems adopted by different collections, and (3) multilingual information access.

Digital libraries may contain resources in many languages. Because they are accessible through the Internet, they may be consulted by individuals in other cultural/linguistic “locales” who seek resources in their own languages or search across languages for resources in languages other than their own. To enable the efficient and effective acquisition, storage and retrieval of cross-cultural and cross-linguistic resources, a digital library has to be designed from the outset to allow for heterogeneous linguistic and cultural content. This design process is called “internationalization.” Internationalization involves:
(a) determining the metadata elements, attributes, value spaces and values that are culturally and linguistically dependent and, if the display and interface of the library are to be localized (translated), those metadata elements that are to be rendered in multiple languages;
(b) providing a mechanism for internationalization that offers administrative control, cross-language searching capability, and authority for keywords (terms), translations and translation equivalents;
(c) providing an internationalization scheme that offers reusability, scalability and interfaces with the relevant international standards.

In the networked environment, cross-domain, cross-subject, and cross-language searching are common practices. However, users are often unaware of the diverse knowledge organization systems (KOS), which impede optimal retrieval.
KOS include thesauri, subject heading lists, classification systems, and other categorization schemes used to index or organize different databases and digital collections. This presentation surveys activities and research projects aimed at achieving interoperability among KOS and analyzes the methods they use. In all, 18 projects are examined and evaluated. Eight methods, both conventional and new, that have proven to be widely accepted and promising are summarized.

Only about one third of the 680 million online users are native English speakers, yet English will remain the dominant language of the Web. Together, these facts create a growing demand for access to information without language obstacles. Cross-language information retrieval (CLIR) systems are key to meeting this demand because they let a user retrieve documents that satisfy her information need regardless of the language in which those documents are written. However, CLIR also poses new challenges that do not exist in monolingual settings. This presentation will concentrate on one such challenge: how to handle translation ambiguities. Because of the complexity of human languages, translation ambiguities are pervasive in cross-language environments and can greatly reduce retrieval effectiveness if they are not handled properly. Moving beyond previous automatic disambiguation approaches, this presentation will discuss the design of a user-assisted query translation approach, which aims to build a synergistic relationship between the user and the retrieval system. The presenter will demonstrate the usefulness of the user-assisted approach by presenting research he conducted at the University of Maryland. The presentation will conclude by discussing implications for the further development of information access in digital libraries.
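
As an illustration only, the idea of user-assisted query translation can be sketched in a few lines of Python. This is not the presenter's system: the toy English-to-Spanish dictionary, the back-translation cues, and the names translate_query and keep_financial_sense are all hypothetical and exist only to show how a user's choice can resolve an ambiguous term instead of an automatic guess.

```python
from typing import Callable, Dict, List

# Hypothetical toy English-to-Spanish dictionary (assumption for illustration).
BILINGUAL_DICT: Dict[str, List[str]] = {
    "bank": ["banco", "orilla"],
    "interest": ["interés"],
    "rate": ["tasa", "ritmo"],
}

# Hypothetical back-translations, shown to the user as disambiguation cues.
BACK_TRANSLATIONS: Dict[str, List[str]] = {
    "banco": ["bank (financial)", "bench"],
    "orilla": ["shore", "riverbank"],
    "interés": ["interest"],
    "tasa": ["rate", "fee"],
    "ritmo": ["pace", "rhythm"],
}

# A chooser receives the source term and its candidates (with cues)
# and returns the translations the user wants to keep.
Chooser = Callable[[str, Dict[str, List[str]]], List[str]]


def translate_query(query: str, choose: Chooser) -> List[str]:
    """Translate a query term by term, asking the user only about ambiguous terms."""
    translated: List[str] = []
    for term in query.lower().split():
        candidates = BILINGUAL_DICT.get(term)
        if candidates is None:
            translated.append(term)        # no entry: keep the term untranslated
        elif len(candidates) == 1:
            translated.extend(candidates)  # unambiguous: no user input needed
        else:
            cues = {c: BACK_TRANSLATIONS.get(c, []) for c in candidates}
            translated.extend(choose(term, cues))
    return translated


def keep_financial_sense(term: str, cues: Dict[str, List[str]]) -> List[str]:
    """Stand-in for an interactive user who wants the financial reading."""
    picks = {"bank": ["banco"], "rate": ["tasa"]}
    return picks.get(term, list(cues))


print(translate_query("bank interest rate", keep_financial_sense))
# → ['banco', 'interés', 'tasa']
```

In a real interface the chooser would be an interactive step rather than a fixed function. Passing it in as a parameter keeps the translation logic separate from however the user's selections are collected.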