Abstract

As our knowledge of the complexity of gene architecture grows, and we increase our understanding of the subtleties of gene expression, the process of accurately describing disease-causing gene variants has become increasingly problematic. In part, this is due to current reference DNA sequence formats that do not fully meet present needs. Here we present the Locus Reference Genomic (LRG) sequence format, which has been designed for the specific purpose of gene variant reporting. The format builds on the successful National Center for Biotechnology Information (NCBI) RefSeqGene project and provides a single-file record containing a uniquely stable reference DNA sequence along with all relevant transcript and protein sequences essential to the description of gene variants. In principle, LRGs can be created for any organism, not just human. In addition, we recognize the need to respect legacy numbering systems for exons and amino acids and the LRG format takes account of these. We hope that widespread adoption of LRGs - which will be created and maintained by the NCBI and the European Bioinformatics Institute (EBI) - along with consistent use of the Human Genome Variation Society (HGVS)-approved variant nomenclature will reduce errors in the reporting of variants in the literature and improve communication about variants affecting human health. Further information can be found on the LRG web site: http://www.lrg-sequence.org.

Highlights

  • In 1993 Ernest Beutler wrote an eloquent letter to the editor of the American Journal of Human Genetics highlighting the deficiencies of the systems used to describe DNA variants [1].

  • The last 17 years have borne witness to the steady development of the nomenclature used to describe sequence variation that is maintained under the auspices of the Human Genome Variation Society (HGVS) [3,4].

  • The present nomenclature may seem like an arcane art-form jealously guarded by zealots. This may have been a valid criticism in the past, but advances in human genetics mean that embracing the nomenclature fully is essential.

Introduction

In 1993 Ernest Beutler wrote an eloquent letter to the editor of the American Journal of Human Genetics highlighting the deficiencies of the systems used to describe DNA variants [1].

Additional annotations, known as the ‘updatable-annotation layer’, that may change with time (each item carrying its own date stamp) will provide ancillary information about a gene. Such annotations will include details of additional transcripts and information for mapping the LRG sequence onto genome assemblies (for example, currently NCBI 36 and Genome Reference Consortium Human (GRCh) 37), as well as cross-referencing of features in the fixed-annotation layer to legacy coordinate systems.

Just as with Mutalyzer, such systems would parse the annotated features of reference DNA sequences to provide the necessary visual cues to help generate an HGVS-nomenclature-compliant description of any variant. Such a system would incorporate cross-checking with legacy numbering systems.
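The split between a fixed-annotation layer and a date-stamped updatable-annotation layer, as described above, can be illustrated with a small data-structure sketch. This is not the LRG file format itself: the class names, field names, the identifier `LRG_example`, and the toy sequence are all hypothetical, chosen only to show one way a stable core and a changeable set of annotations might be kept apart.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class FixedLayer:
    """Stable core of a record (hypothetical fields); frozen so it cannot be edited."""
    record_id: str
    sequence: str
    transcripts: Tuple[str, ...]


@dataclass
class Annotation:
    """One updatable item; each item carries its own date stamp."""
    kind: str                 # e.g. "mapping" or "additional_transcript" (assumed labels)
    value: Dict[str, str]
    modified: date


@dataclass
class Record:
    fixed: FixedLayer
    updatable: List[Annotation] = field(default_factory=list)

    def annotate(self, kind: str, value: Dict[str, str], modified: date) -> None:
        """Add an item to the updatable layer without touching the fixed layer."""
        self.updatable.append(Annotation(kind, value, modified))

    def latest(self, kind: str) -> Optional[Annotation]:
        """Return the most recently stamped item of a given kind, or None."""
        items = [a for a in self.updatable if a.kind == kind]
        return max(items, key=lambda a: a.modified, default=None)


# Usage with made-up data: two assembly mappings added at different (invented) dates.
record = Record(FixedLayer("LRG_example", "ATGC", ("t1",)))
record.annotate("mapping", {"assembly": "NCBI36"}, date(2009, 1, 1))
record.annotate("mapping", {"assembly": "GRCh37"}, date(2010, 1, 1))
newest = record.latest("mapping")
```

Freezing the fixed layer mirrors the idea that the reference sequence and its core transcripts stay stable, while mappings to newer assemblies can be appended over time.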

Beutler E
11. GEN2PHEN
37. Schechter AN
41. Beutler E