Abstract
This article discusses structural, systems, and other types of bias that arise in matching new records to large databases. The focus is databases for bibliographic utilities, but other related database concerns will be discussed. Problems of satisfying a “match” with sufficient flexibility and rigor in an environment of imperfect data are presented, and sources of unintentional variance are discussed.
Highlights
This article discusses structural, systems, and other types of bias that arise in matching new records to large databases
Computerized records in a networked environment have encouraged the recognition that duplicate records pose a serious threat to efficient information retrieval
Matching is defined as the process by which additions to a large database are screened and compared with existing database records
Summary
MARC records frequently mix languages. As other projects involving intensive comparison of text have shown, overlap between languages can confuse comparisons of short strings of text [21]. The cataloging rules for describing formats have changed. Even within a country, standards change over time, so that “correct” cataloging in one decade may not match that in a later period. Records themselves change over time as they are copied, derived, and migrated into other systems. They may be enhanced or corrected in any system where they reside. When they return to the originating database, they may have been transformed so far as to be unrecognizable as representations of the same item. This problem is not unique to XWC; it is a challenge for any shared database where export of records and reentry is likely.
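To make the tension between flexibility and rigor concrete, the following is a minimal sketch of a tolerant comparison of two short text fields, such as titles. It is not the matching algorithm of any bibliographic utility. The function names, the normalization steps, and the 0.9 threshold are all assumptions chosen for illustration.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, strip diacritics and punctuation, collapse whitespace.

    These normalization choices are illustrative assumptions; a real
    matching system would tune them to its own data and cataloging rules.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    kept = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in without_marks
    )
    return " ".join(kept.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_probable_match(a: str, b: str, threshold: float = 0.9) -> bool:
    """Treat two fields as matching when similarity meets the threshold.

    The threshold is a hypothetical value: lowering it tolerates more
    variance between records but risks merging distinct items.
    """
    return similarity(a, b) >= threshold
```

Under these assumptions, `is_probable_match("Les Misérables", "les miserables.")` holds because the two strings normalize to the same text, while two unrelated titles fall below the threshold. Short strings in overlapping languages are the hard case: a few shared characters can move the ratio a long way, which is one reason a single threshold is rarely sufficient on its own.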