Abstract

Ongoing stewardship is required to keep data collections and archives in existence. Scientific data collections may face a range of risk factors that could hinder, constrain, or limit current or future data use. Identifying such risk factors to data use is a key step in preventing or minimizing data loss. This paper presents an analysis of data risk factors that scientific data collections may face, and a data risk assessment matrix to support data risk assessments to help ameliorate those risks. The goals of this work are to inform and enable effective data risk assessment by: a) individuals and organizations who manage data collections, and b) individuals and organizations who want to help to reduce the risks associated with data preservation and stewardship. The data risk assessment framework presented in this paper provides a platform from which risk assessments can begin, and a reference point for discussions of data stewardship resource allocations and priorities.

Highlights

  • At the workshop “The Rescue of Data At Risk,” held in Boulder, Colorado on September 8th and 9th, 2016, participants were asked the following question: “How would you define ‘at risk’ data?” Discussions on this point ranged widely, and touched on several challenges, including lack of funding or personnel support for data management, natural and political disasters, and metadata loss

  • This paper presents an analysis of data risk factors that scientific data collections and archives may face, and a matrix to support data risk assessments to help ameliorate those risks

  • The analysis presented in this paper builds on prior work in a number of areas: 1) research on data risks, 2) data rescue initiatives within government agencies and specific disciplines, 3) Committee on Data (CODATA) and Research Data Alliance (RDA) working groups and meetings, 4) trusted repository certifications, and 5) the knowledge and experience of the Earth Science Information Partners (ESIP) Data Stewardship Committee members

Introduction

At the workshop “The Rescue of Data At Risk,” held in Boulder, Colorado on September 8th and 9th, 2016, participants were asked the following question: “How would you define ‘at risk’ data?” Discussions on this point ranged widely, and touched on several challenges, including lack of funding or personnel support for data management, natural and political disasters, and metadata loss. Many risks can be reduced or eliminated by following best practices codified as certifications and guidelines, such as the CoreTrustSeal Data Repository Certification (2018) and ISO 16363:2012, which defines audit and certification procedures for trustworthy digital repositories (ISO 2012b). A number of loosely organized and coordinated efforts were initiated to duplicate data from US government organizations to prevent potential politically motivated data deletion or obfuscation (see, for example, Dennis 2016; Varinsky 2017). In many cases, these initiatives focused on duplicating government-hosted Earth science data. ESIP Data Stewardship Committee members wrote a white paper to provide the Earth science data centers’ perspective on these grass-roots “data rescue” activities (Mayernik et al. 2017). The white paper provided suggestions for how the grass-roots efforts might productively engage with the data centers themselves.
