Introduction

Information representation and organization traditionally means creating catalogs and indexes for publications of any kind. It includes describing the attributes of a document and representing its intellectual content. Libraries around the world have a long history of recording data about documents and publications; the practice dates back several thousand years. Indexes and library catalogs are created to help users find and locate documents conveniently. The records in these search tools not only serve as an inventory of human knowledge and culture but also provide orderly access to collections.

As in every other business and industry, the representation and organization of information in the network era has changed dramatically at almost every stage of the process. The changes affect not only the methods and technology used to create records for publications, but also the standards that are central to the effectiveness of these tools in searching and retrieving information. Today the library catalog is no longer a tool that serves only visitors searching the library's own collection; it has become a network node that users can reach from anywhere in the world through a computer connected to the Internet. Likewise, indexing databases are no longer limited to newspaper and journal articles; the concept has expanded into the Web information space used for e-publishing, e-business, and e-commerce.

The heart of such a universal information space lies in the standards that allow different types of data to be communicated to, and understood by, heterogeneous platforms and systems. We all know that TCP/IP allows different computer systems to talk to each other and to understand different dialects of networking language. In the world of organizing information, content is represented by terms in natural language, controlled language, or both.
The characteristics of its container (book, journal, film, memo, report, etc.) are encoded in a defined format for computer storage and retrieval. Libraries around the world have used MAchine Readable Cataloging (MARC) (Library of Congress, 1999) to encode information about their collections. In conjunction with cataloging rules, the MARC format standardized the record structure that describes information containers, i.e., books, manuscripts, maps, periodicals, motion pictures, music scores, audio/video recordings, 2-D and 3-D artifacts, and microforms. The Online Computer Library Center (OCLC) in Dublin, Ohio is the largest and busiest cataloging service in the world. Almost 33,000 libraries in 67 countries now use OCLC products and services, and more than 8,650 of them are OCLC members. As e-publishing thrives and the Web information space grows, libraries have extended conventional cataloging of their collections to organizing information on the Web. In the early 1990s, OCLC started the Internet cataloging project, in which librarians from all types of libraries volunteered to contribute MARC records they had created for Gopher servers, listservs, FTP and Web sites, and other networked information resources (OCLC, 1996). Another major undertaking in organizing information on the Web is OCLC's Metadata Initiative (Dublin Core Metadata Initiative, 1999), inaugurated in 1995, which proposed a metadata scheme containing 15 data elements. Among them are title, creator, publisher, subject, description, format, type, source, relation, identifier, and rights. The scheme was named after the city where OCLC is located: the Dublin Core Metadata Element Set (Dublin Core for short). Since its debut, it has become an important part of the emerging infrastructure of the Internet.
Many communities are eager to adopt a common core of semantics for resource description, and the Dublin Core has attracted broad-ranging international and interdisciplinary support for this purpose. …
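
To make the idea of a 15-element metadata scheme concrete, the following is a minimal sketch of how a Dublin Core description might be held as a simple mapping and rendered for embedding in a Web page. The text above names eleven of the elements; the remaining four (contributor, date, language, coverage) are filled in here from the published element set. The function name `to_meta_tags` and the sample record are hypothetical, and the `DC.` prefix in HTML `<meta>` tags is a commonly used embedding convention rather than something described in this paper.

```python
from html import escape

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def to_meta_tags(record):
    """Render a {element: value or list of values} mapping as HTML <meta> tags.

    Hypothetical helper for illustration; every element is optional
    and repeatable, so a value may be a single string or a list.
    """
    unknown = set(record) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")

    lines = []
    for element in DC_ELEMENTS:  # emit in a fixed, predictable order
        values = record.get(element, [])
        if isinstance(values, str):
            values = [values]
        for value in values:
            content = escape(value, quote=True)
            lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)


# Hypothetical record describing a Web document (not taken from the paper).
example = {
    "title": "An Example Report",
    "creator": ["A. Author", "B. Author"],
    "format": "text/html",
}
print(to_meta_tags(example))
```

Keeping the record as a plain mapping reflects the scheme's design goal of being simple enough for non-catalogers to fill in; a MARC record, by contrast, would need a far richer field and subfield structure.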