Abstract

Over the course of a scientific career, a large fraction of the data collected by scientific investigators turns into data at risk of becoming inaccessible to future science. Although a part of the investigators’ data is made available in manuscripts and databases, other data may remain unpublished, non-digital, on degrading or near-obsolete digital media, or inadequately documented for reuse. In 2013, Integrated Earth Data Applications (IEDA) provided data rescue mini-awards to three Earth science investigators. IEDA’s user communities in geochemistry, petrology, geochronology, and marine geophysics collect long-tail data, defined as data produced by individuals and small teams for specific projects. Such data tend to be of small volume and initially intended for use only by these teams, and are therefore less likely to be easily transferred or reused. Long-tail data are at greater risk of omission from the scientific record. The topics of the awarded projects were (1) Geochemical and Geochronological data on volcanic rocks from the Fiji, Izu-Bonin-Mariana arc, and Endeavor segments of the global mid-ocean ridge, (2) High-Resolution, Near-bottom Magnetic Field Data, and (3) Geochemistry of Lunar Glasses. IEDA worked closely with the awardees to create a plan for the data rescue, resulting in the registration of hundreds of samples and the entry of dozens of data and documentation files into IEDA data systems. The data were made openly accessible and citable by assigning persistent identifiers for samples and files. The mini-award program proved that a relatively small incentive combined with data facility guidance can motivate investigators to accomplish significant data rescue.

Highlights

  • Today, most data held by active Earth scientists are data at risk because they are in formats that do not permit full electronic access to the information they contain

  • This paper summarizes the three data rescue projects funded in 2013, their challenges for data rescue, and lessons learned

  • Each project investigator interacted with IEDA staff who had domain training, and together they determined the best route for producing usable data products

Summary

Introduction

Most data held by active Earth scientists are data at risk because they are in formats that do not permit full electronic access to the information they contain. Long-tail data are highly diverse in data type, collection method, and processing method. For this reason, domain-specific and community-guided repositories are well-suited for serving high-quality, trusted data that are suitable for reuse, because they understand the data and their scientific meaning and application, and are responsive to community requirements and concerns. IEDA was well-positioned to conduct a multi-discipline data rescue effort because it operates diverse community-driven databases and tools including data repositories (e.g. the EarthChem Library, Marine Geoscience Data System, and USAP Data Center), registries (e.g. the System for Earth Sample Registration, SESAR), and global syntheses (e.g. PetDB – The Petrological Database and the Global Multi-Resolution Topography (GMRT) synthesis). IEDA assigns persistent identifiers to files (DOIs – Digital Object Identifiers) and samples (IGSNs – International GeoSample Numbers) in order to promote unambiguous and citable identification and access to data and metadata. This paper summarizes the three data rescue projects funded in 2013, their challenges for data rescue, and lessons learned.

The 2013 IEDA data rescue projects
Project 1: Sample curation
Project 2: Standardizing 35 years of evolving technology
Project 3
Common themes addressed by the IEDA data rescue projects
Lessons learned
