Tripal EUtils: a Tripal module to increase exchange and reuse of genome assembly metadata

B Condon, S Buehler, C P Childers, S P Ficklin, A Almsaeed, M E Staton, M F Poelchau

doi:10.1093/database/baz143

Abstract

Data and metadata interoperability between data storage systems is a critical component of the FAIR data principles. Programmatic and consistent means of reconciling metadata models between databases promote data exchange and thus increase the scientific community's access to those data. This process requires (i) metadata mapping between the models and (ii) software to perform the mapping. Here, we describe our efforts to map metadata associated with genome assemblies between the National Center for Biotechnology Information (NCBI) data resources and the Chado biological database schema. We present mappings for multiple NCBI data structures and introduce a Tripal software module, Tripal EUtils, to pull metadata from NCBI into a Tripal/Chado database. We discuss potential mapping challenges and solutions and provide suggestions for future development to further increase interoperability between these platforms. Database URL: https://github.com/NAL-i5K/tripal_eutils.

Highlights

Background: Biologists increasingly recognize the need to make data and metadata more findable, accessible, interoperable and reusable (FAIR) [1].
When data exist in two different structures (whether these are flat files, relational databases or something else), an inability to map between those structures can slow down or entirely prevent data integration and data reuse.
As discussed in the previous section, we focus on National Center for Biotechnology Information (NCBI) databases that (i) make their data and metadata available via NCBI’s Eutilities and (ii) represent data and/or metadata relevant to genome assemblies.

Introduction

Biologists increasingly recognize the need to make data and metadata more findable, accessible, interoperable and reusable (FAIR) [1]. These principles provide a framework for improving research data cyberinfrastructure and equip scientists to use public data to enhance knowledge discovery. Modeling the full structure of data and metadata and creating appropriate linkages between datasets require sophisticated data storage structures, such as a relational database. When data exist in two different structures (whether these are flat files, relational databases or something else), an inability to map between those structures can slow down or entirely prevent data integration and data reuse.
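
The two requirements named in the abstract, a metadata mapping and software that applies it, can be illustrated with a minimal Python sketch. It is not the Tripal EUtils implementation, which is a Tripal module. The E-utilities base URL and the `esummary.fcgi` endpoint are NCBI's public service; the field names and target table/column pairs in `FIELD_MAP`, and the helper names `esummary_url` and `apply_mapping`, are assumptions made up for illustration.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

# Hypothetical mapping from source field names to (table, column) targets.
# These pairs are illustrative assumptions, not the module's actual mapping.
FIELD_MAP = {
    "AssemblyName": ("analysis", "name"),
    "AssemblyAccession": ("analysis", "sourcename"),
    "SpeciesName": ("organism", "species"),
}


def esummary_url(db, uid):
    """Build an ESummary request URL for one record in an NCBI database."""
    query = urlencode({"db": db, "id": uid, "retmode": "json"})
    return f"{EUTILS_BASE}/esummary.fcgi?{query}"


def apply_mapping(record, field_map=FIELD_MAP):
    """Regroup a flat source record by target table.

    Returns the mapped values and the list of source fields that have no
    target, so that unmapped metadata is reported rather than silently lost.
    """
    mapped = {}
    unmapped = []
    for field, value in record.items():
        target = field_map.get(field)
        if target is None:
            unmapped.append(field)
            continue
        table, column = target
        mapped.setdefault(table, {})[column] = value
    return mapped, unmapped
```

Returning the unmapped fields separately is a design choice for this sketch: fields that exist in one model but have no counterpart in the other are exactly where a mapping between two structures tends to break down, so surfacing them makes the gaps visible.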



Journal: Database
Publication Date: Jan 1, 2020
License type: cc-by
