Abstract

The Genome Reference Consortium’s human reference sequence serves three related but distinct purposes: as a representative species genome, as a coordinate system for identifying variants, and as an alignment reference for variation detection algorithms. However, using this one sequence simultaneously as a representative species genome and as an alignment reference introduces unnecessary artifacts for structural variation detection algorithms and limits their accuracy. We show how decoupling these two roles and developing a separate alignment reference can significantly improve the accuracy of structural variation detection, improve genotyping of disease-related genes, and decrease the cost of studying polymorphism in a population.

Highlights

  • We note that none of the Venter Novel Alleles (VNAs) are present in the database of genomic structural variation, except as entries from the HuRef study itself

  • The increased power offered by ref+ can help in genotyping variants of clinical importance, as some of the VNAs affect genes that play a role in disease

Introduction

The initial sequencing and assembly of a human reference genome made it possible to understand our genomic landscape in comparison to other species [1, 2]. It also facilitated our understanding of polymorphism within the human species by providing a high-resolution coordinate system onto which variants could be mapped [2]. The human reference sequence serves three related but distinct purposes: as a representative species genome, as a coordinate system for identifying variants, and as an alignment reference for variation detection algorithms. One notable exception is the idea of a human pan-genome, which was introduced [3] to distinguish the representative species genome from the GRC reference.
