Abstract

This article describes several in-progress, open-source research and development efforts to design, build, and test digital curation services and repositories that have the potential to scale: the IMLS-funded Fedora DRAS-TIC and the NSF-funded Brown Dog. We also discuss the creation of a big records testbed of justice, human rights, and cultural heritage collections (100 TB and 100 million records), the emergence of Computational Archival Science (CAS), and the resulting efforts to integrate digital curation education and research. Our ultimate goal is a sustainable community of users and developers, with solutions that serve the international library, archives, and scientific data management communities, together with digital curation training and education in these innovative environments.

Highlights

  • We present two approaches to the design and curation of digital repositories that exploit current technology to address the emerging issues of capacity scaling, heterogeneous content, and sustainability. The first approach exploits NoSQL distributed database technology to support repositories that can scale out horizontally to thousands of commodity servers

  • The first approach was recently funded through a U.S. Institute of Museum and Library Services (IMLS) grant, called DRAS-TIC Fedora, as part of IMLS’s National Digital Platform (NDP) program

  • Brown Dog is a $10.5M National Science Foundation (NSF)/DIBBs-funded collaboration with the National Center for Supercomputing Applications (NCSA) at the University of Illinois and industry partners (NetApp and Archive Analytics Solutions). This project aims to help accelerate the development of digital curation processes and services and create a data observatory to provide access to Big Records training sets and teach students practical digital curation skills


Summary

Introduction

Brown Dog is a $10.5M NSF/DIBBs-funded collaboration with the NCSA at the University of Illinois and industry partners (NetApp and Archive Analytics Solutions). This project aims to help accelerate the development of digital curation processes and services and create a data observatory to provide access to Big Records training sets and teach students practical digital curation skills.

The projects address human rights and cultural heritage themes (community displacement, racial zoning, refugee narrative, citizen narrative, movement of people, and revealing untold stories) as well as themes in cyberinfrastructure for the curation and management of digital assets at scale (preservation services in the cloud, and scalable distributed repositories). These projects are supported by the development of the DRAS-TIC open-source software, which currently manages 100 million files and 100 TB of cultural heritage data.

It explores eight topics:
1) Evolutionary prototyping and computational linguistics (Bill Underwood)
2) Graph analytics, digital humanities and archival representation (Richard Marciano)
3) Computational finding aids (Greg Jansen)
4) Digital curation (Michael Kurtz)
5) Public engagement with (archival) content (Mark Hedges)
6) Authenticity (Victoria Lemieux)
7) Confluences between archival theory and computational methods (Maria Esteva)
8) Spatial and temporal analytics (Mark Conrad)

Computational Methods
Curation and Appraisal
Creation and Management of Current Records

