Abstract

Given the growing wealth of downstream information, the integration of molecular and non-molecular data on a given organism has become a major challenge. For micro-organisms, this information now includes a growing collection of sequenced genes and complete genomes, and for communities of organisms it includes metagenomes. Integration of the data is facilitated by the existence of authoritative, community-recognized, consensus identifiers that may form the heart of so-called information knuckles. The Genomic Standards Consortium (GSC) is building a mapping of identifiers across a group of federated databases, with the aim of improving navigation across these resources and enabling the integration of their information in the near future. In particular, this is possible because of the existence of INSDC Genome Project Identifiers (GPIDs) and accession numbers, and the ability of the community to define new consensus identifiers such as the culture identifiers used in the StrainInfo.net bioportal. Here we (1) outline the general design of the Genomic Rosetta Stone project, (2) introduce example linkages between key databases (covering information about genomes, 16S rRNA gene sequences, and microbial biological resource centers), and (3) make an open call for participation in this project, providing a vision for its future use.
