Abstract

A small number of de novo assembled human genomes have been reported to date, and few have been complemented with population-based genetic variation, which is particularly important for North Africa, a region underrepresented in current genome-wide references. Here, we combine long- and short-read whole-genome sequencing data with recent assembly approaches into a de novo assembly of an Egyptian genome. The assembly demonstrates well-balanced quality metrics and is complemented with variant phasing via linked reads into haploblocks, which we associate with gene expression changes in blood. To construct an Egyptian genome reference, we identify genome-wide genetic variation within a cohort of 110 Egyptian individuals. We show that differences in allele frequencies and linkage disequilibrium between Egyptians and Europeans may compromise the transferability of European ancestry-based genetic disease risk and polygenic scores, substantiating the need for multi-ethnic genome references. Thus, the Egyptian genome reference will be a valuable resource for precision medicine.

Highlights

  • A small number of de novo assembled human genomes have been reported to date, and few have been complemented with population-based genetic variation, which is important for North Africa, a region underrepresented in current genome-wide references

  • Our Egyptian genome is based on a high-quality human de novo assembly for one male Egyptian individual

  • For the FALCON-based assembly, scaffolding was performed, whereas we found that the WTDBG2-based assembly was of comparable accuracy without scaffolding (Table 1)

Read more

Summary

Introduction

A small number of de novo assembled human genomes have been reported to date, and few have been complemented with population-based genetic variation, which is important for North Africa, a region underrepresented in current genome-wide references. We present the Egyptian genome reference (Egyptref)— a comprehensive North African genetic reference data set—as an initial and essential step towards personal genomics-based precision medicine in this region. It is a combination of (i) a de novo assembled Egyptian personal genome together with phased variants and (ii) a population-genetic characterization of Egyptians based on 19,758,992 single nucleotide variants (SNVs), 121,141 SVs and mitochondrial haplogroups. We anticipate that an Egyptian population genome reference will strengthen precision medicine efforts that eventually benefit nearly 100 million Egyptians, e.g., by providing allele frequencies (AFs) and linkage disequilibrium (LD)

Methods
Results
Conclusion

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.