Abstract

Mutations that add, subtract, rearrange, or otherwise refashion genome structure often affect phenotypes, although the fragmented nature of most contemporary assemblies obscures them. To discover such mutations, we assembled the first new reference-quality genome of Drosophila melanogaster since its initial sequencing. By comparing this new genome to the existing D. melanogaster assembly, we created a structural variant map of unprecedented resolution and identified extensive genetic variation that has remained hidden until now. Many of these variants constitute candidates underlying phenotypic variation, including tandem duplications and a transposable element insertion that amplifies the expression of detoxification-related genes associated with nicotine resistance. The abundance of important genetic variation that still evades discovery highlights how crucial high-quality reference genomes are to deciphering phenotypes.

Highlights

  • Mutations that add, subtract, rearrange, or otherwise refashion genome structure often affect phenotypes, although the fragmented nature of most contemporary assemblies obscures them

  • The development of high-throughput short-read sequencing led to a steep drop in cost and a commensurate increase in the pace of sequencing[8]; it also led to a focus on single-nucleotide changes and small indels[3,9]

  • We saw evidence of this in cDNA from ISO1[19] and in RNA-seq reads in A4 that showed exon junctions flanking transposable element (TE) insertions (Supplementary Figs. 20–22 and Supplementary Table 6), which represents a genome-wide view of TE-derived introns segregating in a population


Summary


They are poorly tagged by common variants, complicating genome-wide association study (GWAS) approaches for mapping traits; this mirrors similar complications in human GWAS[24]. Non-TE insertions represented 20% of ISO1 and 23% of A4 insertions, and they accounted for 170 kb of sequence variation (Fig. 1d and Table 1). These mutations were much smaller than TEs (median 213 bp versus 4.7 kb), they often affected genes, and 23% even escaped detection by short reads (Fig. 1b). We compared published gene expression data from larvae of A4 to expression data for a DSPR strain called A3[30] and identified 17 genes with increased expression that are duplicated in A4 but single copy in ISO1 (Supplementary Table 8), including genes previously identified as candidates for cold adaptation, olfactory response, and toxin resistance, among others (Fig. 3a,d and Supplementary Tables 8 and 9). Eight of these CNVs were invisible to short-read methods (Supplementary Table 8).

