Abstract
This article shares several in-progress, open-source research and development approaches to designing, building, and testing digital curation services and repositories that have the potential to scale (the IMLS-funded Fedora DRAS-TIC and the NSF-funded Brown Dog). We also discuss the creation of a big records testbed of justice, human rights, and cultural heritage collections (100 TB and 100 million records), the emergence of Computational Archival Science (CAS), and the resulting efforts to integrate digital curation education and research. Ultimately, we seek to develop a sustainable community of users and developers, with solutions that serve the international library, archives, and scientific data management communities. We also focus on digital curation training and education in these innovative environments.
Highlights
We present two approaches to the design and curation of digital repositories that exploit current technology to address the emerging issues of capacity scaling, heterogeneous content, and sustainability. The first approach exploits NoSQL distributed database technology to support repositories that can scale out horizontally to thousands of commodity servers.
This was recently funded through a U.S. Institute of Museum and Library Services (IMLS) grant, called DRAS-TIC Fedora, as part of IMLS’s National Digital Platform (NDP) program.
Brown Dog is a $10.5M National Science Foundation (NSF)/DIBBs-funded collaboration with the University of Illinois National Center for Supercomputing Applications (NCSA) and industry partners (NetApp and Archive Analytics Solutions). This project aims to help accelerate the development of digital curation processes and services, create a data observatory that provides access to Big Records training sets, and teach students practical digital curation skills.
Summary
Projects span human rights and cultural heritage themes (community displacement, racial zoning, refugee narrative, citizen narrative, movement of people, and revealing untold stories) as well as themes in cyberinfrastructure for the curation and management of digital assets at scale (preservation services in the cloud and scalable distributed repositories). These projects are supported by the development of the DRAS-TIC open-source software, which currently manages 100 million files and 100 TB of cultural heritage data. It explores eight topics: 1) Evolutionary prototyping and computational linguistics (Bill Underwood), 2) Graph analytics, digital humanities and archival representation (Richard Marciano), 3) Computational finding aids (Greg Jansen), 4) Digital curation (Michael Kurtz), 5) Public engagement with (archival) content (Mark Hedges), 6) Authenticity (Victoria Lemieux), 7) Confluences between archival theory and computational methods (Maria Esteva), and 8) Spatial and temporal analytics (Mark Conrad).