Abstract

Background: The development of high-throughput sequencing and analysis has accelerated multi-omics studies of thousands of microbial species, metagenomes, and infectious disease pathogens. Omics studies are enabling genotype-phenotype association studies, which identify genetic determinants of pathogen virulence and drug resistance, as well as phylogenetic studies designed to track the origin and spread of disease outbreaks. These omics studies are complex and often employ multiple assay technologies, including genomics, metagenomics, transcriptomics, proteomics, and metabolomics. To maximize the impact of omics studies, it is essential that data be accompanied by detailed contextual metadata (e.g., specimen, spatial-temporal, and phenotypic characteristics) in clear, organized, and consistent formats. Over the years, various metadata standards initiatives have developed many metadata standards, including the Genomic Standards Consortium’s minimal information standards (MIxS) and the GSCID/BRC Project and Sample Application Standard. Some tools exist for tracking metadata, but they do not provide event-based capabilities to configure, collect, validate, and distribute metadata. To address this gap in the scientific community, an event-based, data-driven application, OMeta, was created that allows users to quickly configure, collect, validate, distribute, and integrate metadata.

Results: OMeta, a data-driven web application for researchers, consists of a browser-based interface, a command-line interface (CLI), and server-side components that together provide an intuitive platform for configuring, capturing, viewing, and sharing metadata. Project and sample metadata can be defined according to existing standards or according to project goals. Recorded information includes details on the biological samples, procedures, protocols, and experimental technologies. This information can be organized by event, including sample collection, sample quantification, sequencing assay, and analysis results. OMeta supports configuring fields with various presentation types (checkbox, file, drop-box, and ontology), and fields can be configured to use the National Center for Biomedical Ontology (NCBO), a biomedical ontology server. Furthermore, OMeta maintains a complete audit trail of all changes made by users and allows metadata export in comma-separated value (CSV) format for convenient deposition of data into public databases.

Conclusions: We present OMeta, a web-based software application built on data-driven principles for configuring and customizing data standards and for capturing, curating, and sharing metadata.
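
The event-based model described in the Results can be illustrated with a small sketch. This is not OMeta's code or API: the class names (`FieldSpec`, `EventType`, `MetadataStore`), the field names, and the sample values are all hypothetical, chosen only to show how configurable event fields, validation, an audit trail, and CSV export might fit together.

```python
import csv
import io
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class FieldSpec:
    """One configurable metadata attribute (hypothetical schema, not OMeta's)."""
    name: str
    required: bool = False
    allowed: tuple = ()  # empty tuple: any non-empty value is accepted


@dataclass
class EventType:
    """An event such as sample collection, with the fields it records."""
    name: str
    fields: list


@dataclass
class MetadataStore:
    records: dict = field(default_factory=dict)    # sample_id -> {field: value}
    audit_log: list = field(default_factory=list)  # append-only change history

    def validate(self, event, values):
        """Return a list of error messages; an empty list means valid."""
        errors = []
        known = {spec.name for spec in event.fields}
        for key in values:
            if key not in known:
                errors.append(f"unknown field: {key}")
        for spec in event.fields:
            value = values.get(spec.name, "")
            if spec.required and value == "":
                errors.append(f"missing required field: {spec.name}")
            elif value != "" and spec.allowed and value not in spec.allowed:
                errors.append(f"invalid value for {spec.name}: {value}")
        return errors

    def submit(self, event, sample_id, values, user):
        """Validate an event submission, then record each change in the audit log."""
        errors = self.validate(event, values)
        if errors:
            raise ValueError("; ".join(errors))
        record = self.records.setdefault(sample_id, {})
        for key, new in values.items():
            old = record.get(key)
            if old != new:
                self.audit_log.append({
                    "time": datetime.now(timezone.utc).isoformat(),
                    "user": user, "event": event.name, "sample_id": sample_id,
                    "field": key, "old": old, "new": new,
                })
                record[key] = new

    def to_csv(self):
        """Export current metadata as CSV text, one row per sample."""
        columns = sorted({k for rec in self.records.values() for k in rec})
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=["sample_id"] + columns,
                                lineterminator="\n")
        writer.writeheader()
        for sample_id, rec in sorted(self.records.items()):
            writer.writerow({"sample_id": sample_id, **rec})
        return buffer.getvalue()


# Hypothetical configuration and two submissions for the same sample.
collection = EventType("sample_collection", [
    FieldSpec("host", required=True),
    FieldSpec("specimen_type", allowed=("swab", "blood")),
])
store = MetadataStore()
store.submit(collection, "S1", {"host": "human", "specimen_type": "swab"}, user="alice")
store.submit(collection, "S1", {"host": "human", "specimen_type": "blood"}, user="bob")
print(store.to_csv())
```

In this sketch the second submission changes only `specimen_type`, so the audit log gains one entry for it while the unchanged `host` value is not logged again. A real system would also persist the log and restrict who may edit which events, which the sketch leaves out.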

Highlights

  • The development of high-throughput sequencing and analysis has accelerated multi-omics studies of thousands of microbial species, metagenomes, and infectious disease pathogens

  • OMeta has been used by multiple studies and center projects, such as the Genomic Sequencing Center for Infectious Diseases (GSCID)/Genomic Center for Infectious Diseases (GCID), the J. Craig Venter Institute (JCVI) Human Microbiome Project (HMP), and the Data Processing and Coordinating Center (DPCC) of the National Institute of Allergy and Infectious Diseases (NIAID) Centers of Excellence for Influenza Research and Surveillance (CEIRS)

  • The scientific research community recognizes the importance and necessity of standards and metadata collection for biological samples and experiments as they pertain to fundamental research

Summary

Introduction

The development of high-throughput sequencing and analysis has accelerated multi-omics studies of thousands of microbial species, metagenomes, and infectious disease pathogens. Omics tools and technologies are enabling genotype-phenotype association studies that identify genetic determinants of pathogen virulence and drug resistance, as well as phylogenetic studies designed to track the origin and spread of pathogens during disease outbreaks. These omics studies are complex and often employ multiple technologies, including genomics, metagenomics, transcriptomics, proteomics, and metabolomics. As the number of projects increased, we encountered challenges in keeping metadata standards and metadata harmonized with evolving metadata tracking and validation requirements. To address this critical need for the scientific community, we built an event-based, data-driven application, OMeta, which allows users to quickly configure, collect, validate, distribute, and integrate metadata. A summary of existing tools and their features is given in the Discussion.

Methods
Results
Discussion
Conclusion
