Abstract


‘Collections as data’ has become a core activity for libraries in recent years: it is important that we make collections available in machine-readable formats to enable and encourage computational research. However, while this is a necessary output, discussion of the processes and workflows required to turn collections into data, and to make collections data available openly, is just as valuable. With libraries increasingly becoming producers of their own collections – presenting data from digitisation and digital production tools as part of datasets, for example – and making collections available at scale through mass-digitisation programmes, the trustworthiness of our processes comes into question. In a world of big data, often of unclear origins, how can libraries be transparent about the ways in which collections are turned into data, how do we ensure that biases in our collections are recognised and not amplified, and how do we make these datasets available openly for reuse? This paper presents a case study of work underway at the National Library of Scotland to present collections as data in an open and transparent way – from establishing a new Digital Scholarship Service, to workflows and online presentation of datasets. It considers the changes to existing processes needed to produce the Data Foundry, the National Library of Scotland's open data delivery platform, and explores the practical challenges of presenting collections as data online in an open, transparent and coherent manner.

Highlights

  • In 2017, Thomas Padilla wrote of a ‘collections as data imperative’ for libraries and cultural heritage organisations, which focused on three key concepts: generativity, legibility and creativity (Padilla, 2017, p. 2)

  • While material provenance has always been an important topic for libraries and the field of book history, this paper explores why this documentation and transparency is relevant to current library practices around digitisation and data release, and how steps can be taken to make this information available to users

  • A similar Twitter ‘thread’ was published to explain Handwritten Text Recognition (HTR) technology to a public audience, to mark the release of the Library’s first artificial intelligence (AI)-generated dataset using the Transkribus platform (National Library of Scotland, 2021), enabling the Library to communicate the human involvement in AI-generated work, and the problems with this

Summary

Introduction

In 2017, Thomas Padilla wrote of a ‘collections as data imperative’ for libraries and cultural heritage organisations, which focused on three key concepts: generativity, legibility and creativity (Padilla, 2017, p. 2). The International GLAM Labs Community, for example, advocates for ‘laboratory’-style innovative experimentation with digital collections. Digital Scholarship teams, services and roles are familiar in US organisations, and are increasingly becoming a core part of European research libraries (such as the British Library) and beyond. Within this broader context of ‘collections as data’ activity, this case study, based on a presentation delivered at the LIBER2020 conference (Ames, 2020a), explores how the National Library of Scotland’s open data-delivery platform, the Data Foundry, has been designed to include data provenance; how the Library’s Digital Scholarship Service works to embed transparency into the Library’s processes; and the practical implications, benefits and challenges of this activity. It recognises that libraries are increasingly becoming data producers – and producers of their own collections – and that this is problematic, and explores the ways in which the National Library of Scotland is approaching this issue.

Launching a Digital Scholarship Service
Embedding Transparency in Library Practices
Communicating Transparency
Value of Clarity in Online Presentation of Data
Data Provenance
Conclusions
