Abstract

The initial draft of the human reference genome was assembled in 2001, and nearly all human genomes sequenced since then have been mapped to this monoploid reference. As genome and exome sequencing have become more routine in clinical diagnostics over the last 2 decades, the impact of the reference genome has grown. Nevertheless, gaps have remained in the reference, particularly around centromeres and telomeres, resulting predominantly from limitations in sequencing technology. Over the course of several releases, the Genome Reference Consortium addressed many of these gaps, most recently in human build 38 (GRCh38). It is remarkable to consider that, until this year, over 200 megabases (Mb), or approximately 8% of the human genome, remained missing from reference GRCh38 (1). GRCh38 was the first human genome draft to characterize sequence information of the centromeres, albeit with limited coverage and accuracy, while the telomeres remained mostly elusive. To address this, the telomere-to-telomere (T2T) consortium used DNA from a complete hydatidiform mole, taking advantage of its diploid androgenetic nature. The structure of the hydatidiform mole genome, in which both chromosome copies are paternally derived and essentially identical, allowed the T2T investigators to better resolve genetic regions that vary considerably between the parental chromosomes of a typical individual, creating a high-fidelity reference map. The T2T consortium also leveraged recent advances in long-read and ultra-long-read sequencing, which can span the highly repetitive sequences prevalent in both centromeres and telomeres, to accurately assemble the most complete human genome draft to date (1).
