Abstract

The Human Genome Variation Society (HGVS) nomenclature guidelines encourage the accurate and standard description of DNA, RNA, and protein sequence variants in public variant databases and the scientific literature. Inconsistent application of the HGVS guidelines can lead to misinterpretation of variants in clinical settings. Reliable software tools are essential to ensure consistent application of the HGVS guidelines when reporting and interpreting variants. We present the hgvs Python package, a comprehensive tool for manipulating sequence variants according to the HGVS nomenclature guidelines. Distinguishing features of the hgvs package include: (1) parsing, formatting, validating, and normalizing variants on genome, transcript, and protein sequences; (2) projecting variants between aligned sequences, including those with gapped alignments; (3) flexible installation using remote or local data (fully local installations eliminate network dependencies); (4) extensive automated tests; and (5) open source development by a community from eight organizations worldwide. This report summarizes recent and significant updates to the hgvs package since its original release in 2014, and presents results of extensive validation using clinically relevant variants from ClinVar and HGMD.
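To make "parsing, formatting, and validating" concrete, the following is a minimal standard-library sketch of what those operations could look like for the simplest case, a single-base substitution such as `NM_000551.3:c.4A>G`. It is not the hgvs package's parser or API: the names `Substitution`, `parse_substitution`, and `ref_matches` are hypothetical, the pattern covers only plain integer positions, and the reference check assumes the position is a 1-based index into a supplied toy sequence, which ignores the coding-start offset that real `c.` coordinates carry.

```python
import re
from typing import NamedTuple

# Simplified, illustrative pattern: accession ':' type '.' position ref '>' alt.
# Only simple substitutions with plain integer positions are recognized.
_SUBSTITUTION = re.compile(
    r"^(?P<ac>[A-Z]{2}_\d+\.\d+):(?P<kind>[cgmn])\."
    r"(?P<pos>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


class Substitution(NamedTuple):
    """Hypothetical container for one parsed substitution variant."""

    ac: str    # sequence accession with version, e.g. "NM_000551.3"
    kind: str  # coordinate type: c, g, m, or n
    pos: int   # position as written in the variant string
    ref: str   # stated reference base
    alt: str   # alternate base

    def to_hgvs(self) -> str:
        """Format the variant back into its string form."""
        return f"{self.ac}:{self.kind}.{self.pos}{self.ref}>{self.alt}"


def parse_substitution(text: str) -> Substitution:
    """Parse a simple substitution; raise ValueError for anything else."""
    m = _SUBSTITUTION.match(text.strip())
    if m is None:
        raise ValueError(f"not a simple substitution: {text!r}")
    return Substitution(m["ac"], m["kind"], int(m["pos"]), m["ref"], m["alt"])


def ref_matches(variant: Substitution, sequence: str) -> bool:
    """Check the stated reference base against a sequence.

    Assumption: variant.pos is treated as a 1-based index into `sequence`.
    """
    in_range = 1 <= variant.pos <= len(sequence)
    return in_range and sequence[variant.pos - 1] == variant.ref
```

For example, `parse_substitution("NM_000551.3:c.4A>G").to_hgvs()` round-trips the input string, and `ref_matches` can flag a variant whose stated reference base disagrees with the sequence it is described against. Normalization and projection between aligned sequences need alignment and sequence data and are beyond this sketch.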

Highlights

  • The standardized representation of genomic, transcript and protein sequence variants is essential in biomedical research and clinical genetics

  • We have presented the upgraded hgvs package, a comprehensive tool to manipulate variant representations according to the Human Genome Variation Society (HGVS) nomenclature

  • We demonstrated a high concordance of projection and normalization functions using 284,993 unique transcript variants and 309,899 unique genomic variants from ClinVar; where these were discordant, the hgvs package generated the representation preferred by HGVS guidelines

Introduction

The standardized representation of genomic, transcript and protein sequence variants is essential in biomedical research and clinical genetics. Accurate interpretation of sequence variants in genetic tests, and the resulting patient diagnosis, depends on variants being described, communicated, and compared using consistent representations. The Human Genome Variation Society (HGVS) nomenclature guidelines, first proposed in 1998 (Antonarakis, 1998; den Dunnen & Antonarakis, 2000), have become the de facto international standard for reporting sequence variants (Li et al, 2017; Richards et al, 2015). With the widespread adoption of high-throughput sequencing and the complexity of DNA, RNA, and protein variants, the HGVS nomenclature has continued to evolve (den Dunnen et al, 2016). It is challenging for researchers to manually check all the guidelines in the HGVS nomenclature for each variant discovered in modern sequencing-based studies. Significant discordance in the reported HGVS representations across four variant annotation
