Abstract
Over the past decade, the Database of Genomic Variants (DGV; http://dgv.tcag.ca/) has provided a publicly accessible, comprehensive curated catalogue of structural variation (SV) found in the genomes of control individuals from worldwide populations. Here, we describe updates and new features, which have expanded the utility of DGV for both the basic research and clinical diagnostic communities. The current version of DGV consists of 55 published studies, comprising >2.5 million entries identified in >22 300 genomes. Studies included in DGV are selected from the accessioned data sets in the archival SV databases dbVar (NCBI) and DGVa (EBI), and then further curated for accuracy and validity. The core visualization tool (gbrowse) has been upgraded with additional functions to facilitate data analysis and comparison, and a new query tool has been developed to provide flexible and interactive access to the data. The content from DGV is regularly incorporated into other large-scale genome reference databases and represents a standard data resource for new product and database development, in particular for copy number variation testing in clinical labs. The accurate cataloguing of variants in DGV will continue to enable medical genetics and genome sequencing research.
Highlights
Structural variation (SV) refers to balanced or unbalanced changes in DNA content, including cytogenetically visible, submicroscopic and even smaller sequence-level variants.
In the past 10 years, new genomic technologies of increasing resolution have revealed SV to be ubiquitous in all human DNA and often involved in disease [1], with unbalanced alterations of DNA, called copy number variations (CNVs) or smaller insertion/deletion events, encompassing an order of magnitude more nucleotides than even single nucleotide polymorphisms (SNPs) [2].
The majority of the early studies in the Database of Genomic Variants (DGV) were generated from low-resolution microarrays on a limited number of samples and often had high false-positive and false-negative rates [7].
Summary
Structural variation (SV) refers to balanced or unbalanced changes in DNA content, including cytogenetically visible, submicroscopic and even smaller sequence-level variants. A pipeline was developed to exchange data between the DGVa and dbVar archives [15], and all data sets in these archives that describe SV in healthy human control samples are sent to DGV for curation, interpretation and display (Supplementary Figure S1). Authors are encouraged to submit their raw data to the appropriate archive, either Gene Expression Omnibus [16] or ArrayExpress [17], and their processed variant calls to DGVa or dbVar. A study that passes curation and quality control is selected for inclusion and display in DGV.