Abstract
The Genomic Expression Archive (GEA) for functional genomics data from microarray and high-throughput sequencing experiments has been established at the DNA Data Bank of Japan (DDBJ) Center (https://www.ddbj.nig.ac.jp), which is a member of the International Nucleotide Sequence Database Collaboration (INSDC) with the US National Center for Biotechnology Information and the European Bioinformatics Institute. The DDBJ Center collects nucleotide sequence data and associated biological information from researchers and also services the Japanese Genotype–phenotype Archive (JGA) with the National Bioscience Database Center for collecting human data. To automate the submission process, we have implemented the DDBJ BioSample validator which checks submitted records, auto-corrects their format, and issues error messages and warnings if necessary. The DDBJ Center also operates the NIG supercomputer, prepared for analyzing large-scale genome sequences. We now offer a secure platform specifically to handle personal human genomes. This report describes database activities for INSDC and JGA over the past year, the newly launched GEA, and the submission, retrieval, and analysis services available in our supercomputer system, together with their recent developments.
Highlights
The DNA Data Bank of Japan (DDBJ, https://www.ddbj.nig.ac.jp) [1] is a public nucleotide sequence database established at the National Institute of Genetics (NIG, https://www.nig.ac.jp)
Since 1987, the DDBJ Center has been collecting annotated nucleotide sequences as its traditional database service. This endeavor is conducted in collaboration with GenBank [2] at the National Center for Biotechnology Information (NCBI) and with the European Nucleotide Archive (ENA) [3] at the European Bioinformatics Institute (EBI)
Besides the Gene Expression Omnibus (GEO) at the NCBI [8] and ArrayExpress at the EBI [9], the Genomic Expression Archive (GEA) issues accession numbers to functional genomics experiments, whose data are associated with metadata in a structured and standardized MAGE-TAB format [10], and public GEA data will be indexed by ArrayExpress
Summary
Since 1987, the DDBJ Center has been collecting annotated nucleotide sequences as its traditional database service. Within the INSDC framework, the DDBJ Center services the DDBJ Sequence Read Archive (DRA) for raw sequencing data and alignment information from high-throughput sequencing platforms [5], BioProject for sequencing project metadata, and BioSample for sample information [1,6]. In July 2018, the DDBJ Center launched a new public database, the Genomic Expression Archive (GEA, https://www.ddbj.nig.ac.jp/gea), which collects functional genomics data from microarray and high-throughput sequencing experiments.