Abstract

Advances in high-throughput sequencing have led to unprecedented growth in the amount of available genome sequencing data, especially for bacterial genomes, and this growth makes storing and managing such large datasets a challenge. To facilitate bacterial research and related studies, we have developed the Mypathogen database (MPD), which lets users search, download, store and share bacterial genomics data. The MPD represents the first pathogen database for microbial genomes and metagenomes, and currently covers pathogenic microbial genomes (6604 genera, 11 071 species, 41 906 strains) and metagenomic data from host, air, water and other sources (28 816 samples). The MPD also functions as a management system for statistical and storage data that different organizations can use, thereby facilitating data sharing among organizations and research groups. A user-friendly local client tool is provided to keep the transfer of large sequencing datasets stable. The MPD is a useful tool for analysis and management in genomic research, especially for Centers for Disease Control (CDC), clinical and epidemiological studies, and is expected to contribute to advancing knowledge of pathogenic bacterial genomes and metagenomes.

Database URL: http://data.mypathogen.org

Highlights

  • With the rapid development of next-generation sequencing, the enormous and continuously growing volume of bacterial DNA sequence data poses a challenge for both academic users and database curators [1]

  • A database system designed to host a range of pathogenic microbial genomes would be extremely helpful for Centers for Disease Control (CDC), clinical and epidemiological research

  • In the Mypathogen database (MPD), all genomic and metagenomic raw sequencing data and quality-control data hosted on the website are in the standard FASTQ format; the assembled data are in FASTA format; the genome feature annotations are in GFF, CDS and PEP formats; and the diversity analysis output for metagenome data is in TXT format
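
As an illustration of the FASTQ format named above, the following minimal Python sketch reads raw-read records and computes a mean base quality per read. It is not part of the MPD or its client tool. It assumes the common four-line record layout and a Phred+33 quality encoding, which the source does not specify; the names `FastqRecord`, `read_fastq` and `mean_phred` are hypothetical.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    read_id: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream, assuming four lines per record."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of stream
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline()
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:].strip(), sequence, quality)


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string; the offset of 33 is an assumption."""
    if not quality:
        return 0.0
    return sum(ord(char) - offset for char in quality) / len(quality)


# Two made-up reads, used only to exercise the functions above.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!II\n")
records = list(read_fastq(example))
for record in records:
    print(record.read_id, len(record.sequence), mean_phred(record.quality))
```

For real files, a maintained parser such as the one in Biopython would normally be preferable; the sketch only shows how the record structure maps onto code.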



Introduction

With the rapid development of next-generation sequencing, the enormous and continuously growing volume of bacterial DNA sequence data poses a challenge for both academic users and database curators [1]. Here we introduce the Mypathogen database (MPD), a management system designed to host a range of pathogenic microbial genomes and to let researchers search, download and share bacterial genomics data. It should be helpful for Centers for Disease Control (CDC), clinical and epidemiological research.

