KoNA: Korean Nucleotide Archive as A New Data Repository for Nucleotide Sequence Data.

Gunhwan Ko, Seungwoo Hwang, Kiwon Jang, Seon-Young Kim, Iksu Byeon, Jae Ho Lee, Ji-Hwan Park, Byung-Ha Yoon, Jong-Hwan Kim, Jinhyuk Choi, Pan-Gyu Kim, Sora Kim, Young Mi Sim, Bang Hyuck Lee, Sang-Ok Kim, Jaeeun Jung, Insoo Jang, Wangho Song, Jin Ok Yang, Hyerin Kim, Byungwook Lee, Jongbum Jeon

doi:10.1093/gpbjnl/qzae017

Abstract

During the last decade, the generation and accumulation of petabase-scale high-throughput sequencing data have resulted in great challenges, including access to human data, as well as transfer, storage, and sharing of enormous amounts of data. To promote data-driven biological research, the Korean government announced that all biological data generated from government-funded research projects should be deposited at the Korea BioData Station (K-BDS), which consists of multiple databases for individual data types. Here, we introduce the Korean Nucleotide Archive (KoNA), a repository of nucleotide sequence data. As of July 2022, the Korean Read Archive in KoNA has collected over 477 TB of raw next-generation sequencing data from national genome projects. To ensure data quality and prepare for international alignment, a standard operating procedure was adopted, which is similar to that of the International Nucleotide Sequence Database Collaboration. The standard operating procedure includes quality control processes for submitted data and metadata using an automated pipeline, followed by manual examination. To ensure fast and stable data transfer, a high-speed transmission system called GBox is used in KoNA. Furthermore, the data uploaded to or downloaded from KoNA through GBox can be readily processed using a cloud computing service called Bio-Express. This seamless coupling of KoNA, GBox, and Bio-Express enhances the data experience, including submission, access, and analysis of raw nucleotide sequences. KoNA not only satisfies the unmet needs for a national sequence repository in Korea but also provides datasets to researchers globally and contributes to advances in genomics. KoNA is available at https://www.kobic.re.kr/kona/.

Journal: Genomics, Proteomics & Bioinformatics	Publication Date: Mar 1, 2024
Citations: 1	License type: CC BY 4.0

Similar Papers

Towards the unification of sequence-based classification and sequence-based identification of host-associated microorganisms.
Joshua R Herr ... Maarja Öpik
New Phytologist | VOL. 205
26 Nov 2014

Aspects of NCBI GenBank as a Biodiversity Information Resource
Takeru Nakazato
Biodiversity Information Science and Standards | VOL. 8
24 Sep 2024

Enabling Community Curation of Biological Source Annotations of Molecular Data Through PlutoF and the ELIXIR Contextual Data Clearinghouse
Vishnukumar Balavenkataraman Kadhirvelu ... Suran Jayathilaka
Biodiversity Information Science and Standards | VOL. 6
23 Aug 2022

CDinFusion--submission-ready, on-line integration of sequence and contextual data.
Wolfgang Hankeln ... Jan Gerken
PLoS ONE | VOL. 6
13 Sep 2011
