Abstract

The reference human genome sequence is inarguably the most important and widely used resource in the fields of human genetics and genomics. It has transformed the conduct of biomedical sciences and brought invaluable benefits to the understanding and improvement of human health. However, the commonly used reference sequence has profound limitations, because across much of its span, it represents the sequence of just one human haplotype. This single, monoploid reference structure presents a critical barrier to representing the broad genomic diversity in the human population. In this review, we discuss the modernization of the reference human genome sequence to a more complete reference of human genomic diversity, known as a human pangenome.

Highlights

  • Over the past two decades, we have seen unprecedented advancements in DNA sequencing technologies, bioinformatics, and clinical genetics

  • The growing volume of genomic variation data in numerous databases (e.g., dbSNP [128]), together with the availability of other high-quality reference genome sequences carrying diverse haplotypes, made it difficult to host these data and to build tooling that fully integrates them relative to the reference coordinate system. These challenges motivated the development of a new human pangenome reference structure, one fully capable of representing diversity across a large number of genomes and supported by the bioinformatics pipelines needed for mapping short sequencing reads, detecting genomic variants, and discovering functional elements [52, 82, 110]

  • As we reflect on the impact that the reference human genome sequence has had on the field of genomics, it is apparent that the lessons we have learned are invaluable

Introduction

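Over the past two decades, we have seen unprecedented advancements in DNA sequencing technologies, bioinformatics, and clinical genetics. The growing volume of genomic variation data in numerous databases (e.g., dbSNP [128]), together with the availability of other high-quality reference genome sequences carrying diverse haplotypes, made it difficult to host these data and to build tooling that fully integrates them relative to the reference coordinate system. These challenges motivated the development of a new human pangenome reference structure, one fully capable of representing diversity across a large number of genomes and supported by the bioinformatics pipelines needed for mapping short sequencing reads, detecting genomic variants, and discovering functional elements [52, 82, 110].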

