CADRE: A Collaborative, Cloud-Based Solution for Big Bibliographic Data Research in Academic Libraries.

Patricia L Mabry, Stephanie Hernandez Mcgavin, Robert Van Rennes, Valentin Pentchev, Jamie V Wittenberg, Xiaoran Yan

doi:10.3389/fdata.2020.556282

Abstract

Big bibliographic datasets hold promise for revolutionizing the scientific enterprise when combined with state-of-the-science computational capabilities. Yet hosting proprietary and open big bibliographic datasets poses significant difficulties for libraries, both large and small. Libraries face significant barriers to hosting such assets, including cost and expertise, which have limited their ability to provide stewardship for big datasets and have thus hampered researchers' access to them. What is needed is a solution that addresses the joint needs of libraries and researchers. This article outlines the theoretical framework that underpins the Collaborative Archive and Data Research Environment (CADRE) project. To address this need, we recommend a shared cloud-based infrastructure built on five pillars: 1) Community: a community of libraries and industry partners who support and maintain the platform and a community of researchers who use it; 2) Access: the sharing platform should be accessible and affordable to both proprietary data customers and the general public; 3) Data-Centric: the platform is optimized for efficient and high-quality bibliographic data services, satisfying diverse data needs; 4) Reproducibility: the platform should be designed to foster and encourage reproducible research; 5) Empowerment: the platform should empower researchers to perform big data analytics on the hosted datasets. In this article, we describe the many facets of the problem faced by American academic libraries and researchers wanting to work with big datasets. We propose a practical solution based on the five pillars: the Collaborative Archive and Data Research Environment. Finally, we address potential barriers to implementing this solution and strategies for overcoming them.

Highlights

Big bibliographic datasets hold promise for revolutionizing the scientific enterprise when combined with state-of-the-science computational capabilities (Fortunato et al., 2018).
Acquiring data for continued research use depends on a technical and legal framework that has been long-established in research libraries (Li et al., 2019).
We suggest that the answer may lie in CADRE, a cloud-based platform for text and data mining, which could provide sustainable, scalable, and standardized data and analytic services for open and proprietary big bibliographic datasets.

Summary

INTRODUCTION

THE RISE OF BIG BIBLIOGRAPHIC DATASETS IN RESEARCH AND HOW LIBRARIES STRUGGLE TO MEET DEMANDS

The user base for these datasets is limited to individual researchers or large and well-funded academic libraries with the resources and technical expertise to host big data. This is true for both proprietary and open bibliographic data. Existing local and national efforts to build infrastructure in support of data-intensive research, like XSEDE, often do not offer services or architecture that account for the nuanced, disparate licensing requirements of many large, proprietary datasets. This key feature of the CADRE solution differentiates it from similar platforms. Access to bibliographic and other text-based datasets would enable academic libraries with appropriate subscriptions to provide a high level of analysis service to data-intensive researchers without the requirements of local expertise or infrastructure. Utilizing native cloud visualization systems could ensure users are given an easy and integrated way to visualize results; QuickSight and Power BI are two cloud-based visualization systems that could be used. Integrated Jupyter and Databricks notebooks could provide access to resources like Spark, R, Python, Scala, SQL, SPARQL, Cypher, and Gremlin.

DISCUSSION

DATA AVAILABILITY STATEMENT

Journal: Frontiers in big data	Publication Date: Nov 20, 2020
Citations: 13	License type: CC BY 4.0

Similar Papers

CADRE
Xiaoran Yan ... Dimitar Nikolov
26 Oct 2021

What citation patterns reveal about reading research and practice in academic libraries
Keren Dali ... Lindsay Mcniff
Reference Services Review | VOL. 47
17 Oct 2019

Academic Library Administrators Perceive Value in Their Librarians’ Research
Elaine Sullo
Evidence Based Library and Information Practice | VOL. 9
09 Sep 2014

CCCORE: Cloud Container for Collaborative Research
Salini Suresh ... L Manjunatha Rao
01 Jun 2018