Abstract

Human biomedical datasets that are critical for research and clinical studies to benefit human health also often contain sensitive or potentially identifying information of individual participants. Thus, care must be taken when they are processed and made available to comply with ethical and regulatory frameworks and informed consent data conditions. To enable and streamline data access for these biomedical datasets, the Global Alliance for Genomics and Health (GA4GH) Data Use and Researcher Identities (DURI) work stream developed and approved the Data Use Ontology (DUO) standard. DUO is a hierarchical vocabulary of human and machine-readable data use terms that consistently and unambiguously represents a dataset’s allowable data uses. DUO has been implemented by major international stakeholders such as the Broad and Sanger Institutes and is currently used in annotation of over 200,000 datasets worldwide. Using DUO in data management and access facilitates researchers’ discovery of, and access to, relevant datasets. DUO annotations increase the FAIRness of datasets and support data linkages using common data use profiles when integrating the data for secondary analyses. DUO is implemented in the Web Ontology Language (OWL) and, to increase community awareness and engagement, hosted in an open, centralized GitHub repository. DUO, together with the GA4GH Passport standard, offers a new, efficient, and streamlined data authorization and access framework that has enabled increased sharing of biomedical datasets worldwide.
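
As a rough illustration of what a hierarchical, machine-readable data use vocabulary makes possible, the sketch below models three DUO permission terms and checks whether a dataset's annotation covers a requested use. The term IDs and labels are quoted from the public DUO release as best recalled and should be verified against the repository. The parent links, the `DUO_TERMS` table, and the `ancestors` and `permits` helpers are hypothetical simplifications for this example, not the standard's matching algorithm, which also involves modifier terms and disease ontology codes.

```python
# Illustrative subset of DUO permission terms: ID -> (label, parent ID).
# Assumption: IDs and labels should be checked against the DUO release,
# and the parent links are a simplified reading of the hierarchy.
DUO_TERMS = {
    "DUO:0000042": ("general research use", None),
    "DUO:0000006": ("health or medical or biomedical research", "DUO:0000042"),
    "DUO:0000007": ("disease specific research", "DUO:0000006"),
}


def ancestors(term_id):
    """Return the chain of broader terms above term_id, nearest first."""
    chain = []
    parent = DUO_TERMS[term_id][1]
    while parent is not None:
        chain.append(parent)
        parent = DUO_TERMS[parent][1]
    return chain


def permits(dataset_term, requested_term):
    """Hypothetical rule: a dataset annotation covers a request when the
    requested use is the same term or a narrower term beneath it."""
    return dataset_term == requested_term or dataset_term in ancestors(requested_term)


if __name__ == "__main__":
    # A dataset open to general research use covers a disease-specific request.
    print(permits("DUO:0000042", "DUO:0000007"))
    # A disease-specific dataset does not cover a broader biomedical request.
    print(permits("DUO:0000007", "DUO:0000006"))
```

Under these assumptions, a broader annotation such as general research use covers any narrower request beneath it, while a narrow annotation never covers a broader request.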

Highlights

  • To address global scientific challenges in health, human biomedical data must be shared and integrated worldwide.[1]

  • We report on the Data Use Ontology (DUO) standard, describe the curated structured vocabulary and hierarchies, and review use cases and considerations in implementing DUO for the management and access of biomedical datasets.

  • In 2019, DUO was unanimously approved as a GA4GH standard by the GA4GH Steering Committee, joining other products in the GA4GH Genomic Toolkit suite.[1]

Introduction

To address global scientific challenges in health, human biomedical data must be shared and integrated worldwide.[1] To promote discovery and improve healthcare, researchers and clinicians need to be able to find, access, harmonize, and re-use data from diverse data sources. Data access for research is often facilitated by data repositories, and in a growing number of federated data environments[2] that aggregate datasets within or among themselves and make the results available to the.
