Abstract
The Bioinformation and DDBJ Center (https://www.ddbj.nig.ac.jp) in the National Institute of Genetics (NIG) maintains a primary nucleotide sequence database as a member of the International Nucleotide Sequence Database Collaboration (INSDC), in partnership with the US National Center for Biotechnology Information and the European Bioinformatics Institute. The NIG operates the NIG supercomputer as a computational basis for the construction of DDBJ databases and as a large-scale computational resource for Japanese biologists and medical researchers. To accommodate the rapidly growing amount of deoxyribonucleic acid (DNA) nucleotide sequence data, the NIG replaced its supercomputer system in early 2019 with a new system designed for big data analysis of genome data. The new system is equipped with 30 PB of DNA data archiving storage, large-scale parallel distributed file systems (13.8 PB in total), and 1.1 PFLOPS of computation nodes and graphics processing units (GPUs). Moreover, as a starting point for developing a multi-cloud infrastructure for bioinformatics, we have also installed an automatic file transfer system that allows users to avoid data lock-in and to balance cost and performance by choosing the most suitable environment, among the supercomputer and public clouds, for each workload.
Highlights
The DNA Data Bank of Japan (DDBJ) [1] is a public database of nucleotide sequences established at the National Institute of Genetics (NIG).
Since 1987, the DDBJ Center has been collecting annotated nucleotide sequences as its traditional database service. This endeavour has been conducted in collaboration with GenBank [2] at the US National Center for Biotechnology Information (NCBI) and in partnership with the European Nucleotide Archive (ENA) [3] at the European Bioinformatics Institute (EBI).
We report on updates to the above-mentioned services at the DDBJ Center and on the new supercomputer system.
Summary
For individual human genotype and phenotype data requiring authorized access, the DDBJ Center has provided the controlled-access database Japanese Genotype-phenotype Archive (JGA) since 2013, in collaboration with the National Bioscience Database Center (NBDC) in the Japan Science and Technology Agency (JST) [10]. The supercomputer system that the NIG operates as the computational infrastructure for developing the DDBJ databases is also made available to Japanese researchers in medicine and biology as a large-scale computational resource [11].