Abstract

The life sciences (LS) are advanced in research data management, having established disciplinary tools for data archiving as well as metadata standards for data reuse. However, tools that support data management and data analytics during the active research process are lacking. As a result, ensuring that research data are FAIR (findable, accessible, interoperable and reusable) before and after publication, and that analyses are reproducible, is tedious and demanding work. CyVerse US, an initiative from the University of Arizona, US, supports all processes from data generation, management, sharing and collaboration to analytics. In the presented project, we deployed an independent instance of CyVerse in Graz, Austria (CAT) within the framework of the BioTechMed association. CAT helped to enhance and simplify collaborations between the three main universities in Graz. The steps required were (i) creating a distributed computational and data management architecture (iRODS-based), (ii) identifying and incorporating relevant data from LS researchers and (iii) identifying and hosting relevant tools, including analytics software packaged with Docker technology to ensure reproducible analytics for the researchers taking part in the initiative. This initiative supports research-related processes, including data management and analytics, for LS researchers. It also has the potential to serve other disciplines and offers Austrian universities a way to integrate their infrastructure into the European Open Science Cloud.

Highlights

  • Over recent years, increasing numbers of gatekeepers such as funding organizations and journals demand data sharing, data management and data stewardship according to FAIR (Findability, Accessibility, Interoperability and Reusability) principles in order to ensure transparency, reproducibility and reusability of research [5,6,7]

  • In an attempt to overcome the problem, we identified a US initiative from the University of Arizona called CyVerse US, which supports these processes from data generation, management, sharing and collaboration to analytics

  • As an additional benefit, CyVerse enables the use of existing storage solutions and of available high performance computing (HPC) clusters to perform data analytics using tools in Docker containers

Summary

Introduction

Over recent years, increasing numbers of gatekeepers such as funding organizations (the European Commission and, on the national level, the French ANR [1], US NIH [2], UK Wellcome Trust [3] and Austrian FWF [4]) and journals have demanded data sharing, data management and data stewardship according to the FAIR (Findability, Accessibility, Interoperability and Reusability) principles in order to ensure transparency, reproducibility and reusability of research [5,6,7]. Beyond meeting those requirements, implementing the FAIR principles provides advantages for different stakeholders; for example, researchers receive credit for their work and benefit from data shared by other researchers. Other disciplines such as life sciences (LS) already have established tools for long-term storage (e.g., https://www.ncbi.nlm.nih.gov/, https://www.ebi.ac.uk/) and standards for FAIR practice (e.g., http://www.dcc.ac.uk/resources/metadata-standards/abcd-access-biological-collection-data).

