Abstract

Heterochromatic regions of the genome are repeat-rich and poor in protein coding genes, and are therefore underrepresented in even the best genome assemblies. Sex-limited chromosomes are among the most difficult regions of the genome to assemble. The Drosophila melanogaster Y chromosome is entirely heterochromatic, yet has wide-ranging effects on male fertility, fitness, and genome-wide gene expression. The genetic basis of this phenotypic variation is difficult to study, in part because we do not know the detailed organization of the Y chromosome. To study Y chromosome organization in D. melanogaster, we developed an assembly strategy involving the in silico enrichment of heterochromatic long single-molecule reads and used these reads to create targeted de novo assemblies of heterochromatic sequences. We assigned contigs to the Y chromosome using Illumina reads to identify male-specific sequences. Our pipeline extends the D. melanogaster reference genome by 11.9 Mb, closes 43.8% of the gaps, and improves overall contiguity. The addition of 10.6 Mb of Y-linked sequence permitted us to study the organization of repeats and genes along the Y chromosome. We detected a high rate of duplication to the pericentric regions of the Y chromosome from other regions in the genome. Most of these duplicated genes exist in multiple copies. We detail the evolutionary history of one sex-linked gene family, crystal-Stellate. While the Y chromosome does not undergo crossing over, we observed high gene conversion rates within and between members of the crystal-Stellate gene family, Su(Ste), and PCKR, compared to genome-wide estimates. Our results suggest that gene conversion and gene duplication play an important role in the evolution of Y-linked genes.

Highlights

  • Heterochromatic regions of the genome are repeat-rich and poor in protein coding genes, and are underrepresented in even the best genome assemblies

  • Major blocks of heterochromatin including the Y chromosome are missing from the latest version of the D. melanogaster genome (R6; Hoskins et al 2015)

  • We built a new assembly of the D. melanogaster genome that closes gaps in release 6 (R6) and adds to the assembly in heterochromatin, most notably the

Summary

Introduction

Heterochromatic regions of the genome are repeat-rich and poor in protein coding genes, and are underrepresented in even the best genome assemblies. Single-molecule long-read sequencing approaches (Branton et al 2008; Eid et al 2009) are improving our ability to assemble repetitive regions of complex genomes (Huddleston et al 2014; Chaisson et al 2015; Chang and Larracuente 2017; Khost et al 2017), including the Y chromosomes of gorilla and human (Tomaszkiewicz et al 2016; Jain et al 2018; Kuderna et al unpublished data). We describe the landscape of transposable elements (TEs), the high rate of Y-linked gene duplication, and patterns of gene conversion among members of Y-linked multicopy gene families.
