Abstract

The HathiTrust Digital Library (HTDL) contains about 14 million volumes comprising billions of pages of content. The HathiTrust Research Center (HTRC) is a collaborative research initiative jointly led by Indiana University and the University of Illinois at Urbana-Champaign. This paper describes the development of a collections data model by the Workset Creation for Scholarly Analysis project, an HTRC research initiative funded by the Andrew W. Mellon Foundation. The resulting HTRC Workset data model is designed to help humanities scholars describe the selected portions of the HTDL corpus that serve as the objects of their research. Worksets built on this model are persistent and citable, and other scholars can assess them for reuse in further research.

Highlights

  • The HathiTrust Digital Library (HTDL) is a digital library containing 13.95 million volumes, comprising several billion pages of digitized text.

  • The HathiTrust Research Center (HTRC) is a collaborative research initiative jointly based at Indiana University and the University of Illinois at Urbana-Champaign. It supports researchers and humanities scholars who wish to work with the HTDL's large body of data.

  • This article reports on the outcomes of efforts to develop formal definitions of collections, research collections, and worksets in first-order logic, together with a basic ontology capable of capturing the various metadata that describe them.


Summary

Introduction

The HathiTrust Digital Library (HTDL) is a digital library containing 13.95 million volumes, comprising several billion pages of digitized text. This article reports on the outcomes of efforts to develop formal definitions of collections, research collections, and worksets in first-order logic, together with a basic ontology capable of capturing the various metadata that describe them. We discuss a working ontology, derived from the formal definitions, that develops a number of properties vital for distinguishing the various kinds of collections from one another.

