Screen reading on laptops, tablets, and other devices is becoming a more widely accepted form of consuming information. As this trend develops, large-scale book digitization and metadata projects are becoming an increasingly important part of the library and information world. The Internet Archive, launched in 1996, was, along with Project Gutenberg, one of the pioneering projects aimed at making large numbers of scanned texts and other digitized media freely available to users. Libraries, both academic and public, have adopted the Internet Archive's Texts section as a method of providing free access to scanned book content and accompanying metadata in a variety of file formats. One new book digitization project making use of the Internet Archive is the Medical Heritage Library (MHL), a cooperative endeavor of ten institutions that is funded by grants from the Alfred P. Sloan Foundation and the National Endowment for the Humanities. Participants include the National Library of Medicine and the libraries of Columbia, Harvard, and Yale Universities. The project was conceived in 2009 and made its debut the following year. It currently contains more than 28,000 scanned items. The MHL, modeled on the highly successful Biodiversity Heritage Library, is a collection of scanned public domain books on medicine, pharmacy, nursing, and allied areas. Its curators maintain a regularly updated home page and a Facebook page featuring news about the project, images from recently released books, and links to articles concerned with the history of medicine. MHL book scans are contributed by the participating institutions and are made available for in-browser reading and file download in a dedicated section of the Internet Archive. Basic metadata for each text is included, and downloads are available in portable document format (PDF), Kindle, and a variety of other file formats.
The record for each book links to a corresponding record in the Open Library, a supplementary project to the Internet Archive's Texts section, which attempts to offer a single open and publicly editable web page for every book in existence. Readers using the browser version can search inside the text and jump to the page on which the desired word is found. Three different page display options are available: one-page, two-page, and thumbnail views. The browser reader also features a read-aloud option, which draws on optical character recognition (OCR) to present the scanned pages as an audiobook. In addition to searching for items, users may browse the MHL by book title or by author. Users may also browse through a list of subjects and keywords assigned to items in the collection, although the terms used in this option are not clearly identified as being drawn from Library of Congress (LC) subject headings, Medical Subject Headings (MeSH), or both controlled vocabularies. The quality of the scanned images is uniformly high. Color plates and illustrations in the originals are reproduced in color. The potential for special collections librarians and others to integrate the scanned books in the MHL and similar projects into their collections and reference work is tremendous. In addition to substantially broadening the scope of materials available to researchers at a single institution, these images may contribute to the preservation of the originals by serving as digital surrogates. Opportunities for textual analysis, both large and small scale, are substantial as well (the growing field of digital humanities serves as one example). These uses, and others, invite exploration by librarians.