DDBJ Database updates and computational infrastructure enhancement.

Osamu Ogasawara, Takatomo Fujisawa, Yuichi Kodama, Takehide Kosuge, Jun Mashima

doi:10.1093/nar/gkz982

Open Access

https://doi.org/10.1093/nar/gkz982

Journal: Nucleic Acids Research
Publication Date: Nov 14, 2019
Citations: 35
License type: CC BY 4.0

Affiliation: National Institute of Genetics

Abstract

The Bioinformation and DDBJ Center (https://www.ddbj.nig.ac.jp) in the National Institute of Genetics (NIG) maintains a primary nucleotide sequence database as a member of the International Nucleotide Sequence Database Collaboration (INSDC) in partnership with the US National Center for Biotechnology Information and the European Bioinformatics Institute. The NIG operates the NIG supercomputer as a computational basis for the construction of DDBJ databases and as a large-scale computational resource for Japanese biologists and medical researchers. To accommodate the rapidly growing amount of deoxyribonucleic acid (DNA) nucleotide sequence data, NIG replaced its supercomputer system, which is designed for big data analysis of genome data, in early 2019. The new system is equipped with 30 PB of DNA data archiving storage, large-scale parallel distributed file systems (13.8 PB in total), and 1.1 PFLOPS computation nodes and graphics processing units (GPUs). Moreover, as a starting point for developing a multi-cloud infrastructure for bioinformatics, we have also installed an automatic file transfer system that allows users to avoid data lock-in and to balance cost and performance by choosing the most suitable environment among the supercomputer and public clouds for each workload.

Highlights

The deoxyribonucleic acid (DNA) Data Bank of Japan (DDBJ) [1] is a public database of nucleotide sequences established at the National Institute of Genetics (NIG).
Since 1987, the DDBJ Center has been collecting annotated nucleotide sequences as its traditional database service. This endeavour has been conducted in collaboration with GenBank [2] at the US National Center for Biotechnology Information (NCBI) and in partnership with the European Nucleotide Archive (ENA) [3] at the European Bioinformatics Institute (EBI).
We report on updates to the above-mentioned services at the DDBJ Center, and on the new supercomputer system.

Summary

Introduction

For individual human genotype and phenotype data requiring authorized access, the DDBJ Center has provided the controlled-access database Japanese Genotype-phenotype Archive (JGA) since 2013, in collaboration with the National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency (JST) [10]. The supercomputer system that the NIG operates as the computational infrastructure for developing the DDBJ databases is also made available to Japanese researchers in medicine and biology as a large-scale computational resource [11].

Results

Conclusion