Abstract

Academic libraries have a critical role to play as data quality hubs on campus. There is an increased need to ensure data quality within ‘e-science’. Given academic libraries’ curation and preservation expertise, libraries are well suited to support the data quality process. Data quality measurements are discussed, including the fundamental elements of trust, authenticity, understandability, usability and integrity, and are applied to the Digital Curation Lifecycle model to demonstrate how these measures can be used to understand and evaluate data quality within the curatorial process. Opportunities for improvement and challenges are identified as areas that are fruitful for future research and exploration.

Highlights

  • Data quality is a pressing, not to mention costly, issue in industry; a 2002 study [16] calculated that over $600 billion per year was spent on “data quality problems” [9]

  • Data quality issues have become an area of growing attention within academia and academic libraries [11, 6, 14, 12], as scientific practices evolve to exploit robust campus cyberinfrastructure and as funding agencies, such as the National Science Foundation and the National Institutes of Health, increasingly require data management plans to protect and amplify the impact of their investments

  • There are numerous examples in the literature of analog data enabling scientific inquiry decades or longer after it was gathered¹; how do we as a society, and we within academia, preserve this wealth of data for future science while ensuring it is of high quality?

Summary

SCIENTIFIC DATA AT RISK

Data quality issues have become an area of growing attention within academia and academic libraries [11, 6, 14, 12], as scientific practices evolve to exploit robust campus cyberinfrastructure and as funding agencies, such as the National Science Foundation and the National Institutes of Health, increasingly require data management plans to protect and amplify the impact of their investments. “The survival of this data is in question since the data are not housed in long-lived institutions such as libraries. This situation threatens the underlying principles of scientific replicability since in many cases data cannot readily be collected again” [11]. There are numerous examples in the literature of analog data enabling scientific inquiry decades or longer after it was gathered¹; how do we as a society, and we within academia, preserve this wealth of data for future science while ensuring it is of high quality?

Curatorial Practice and Challenges
MEASURING DATA QUALITY
Integrity
Usability
Understandability
Authenticity and Trust
Scaling Post-Hoc Curation
Academic Libraries as Data Quality Hubs
Helping Others to Help Us Help Others
Our Challenge
