Abstract

Transposable elements are mobile DNA sequences that integrate into host genomes using diverse mechanisms with varying degrees of target site specificity. While the target site preferences of some engineered transposable elements are well studied, the natural target preferences of most transposable elements are poorly characterized. Using population genomic resequencing data from 166 strains of Drosophila melanogaster, we identified over 8,000 new insertion sites not present in the reference genome sequence, and used these sites to decode the natural target preferences of 22 transposable element families in this species. We found that terminal inverted repeat transposon and long terminal repeat retrotransposon families present clade-specific target site duplications and target site sequence motifs. Additionally, we found that the sequence motifs at transposable element target sites are always palindromes that extend beyond the target site duplication. Our results demonstrate the utility of population genomics data for high-throughput inference of transposable element targeting preferences in the wild and establish general rules for terminal inverted repeat transposon and long terminal repeat retrotransposon target site selection in eukaryotic genomes.

Highlights

  • Transposable elements (TEs) are mobile DNA sequences that can be found in virtually all organisms from prokaryotes to eukaryotes

  • Using resequencing data from 166 isofemale strains of Drosophila melanogaster produced by the Drosophila Genetic Reference Panel (DGRP) project [22,23], we identified over 8,000 new TE insertion sites not present in the reference genome sequence [24]. We use these sites to analyze properties of target site duplications (TSDs) and target site motifs (TSMs) for 22 families of terminal inverted repeat (TIR) and long terminal repeat (LTR) elements

  • Assuming results for the families studied here can be generalized to other TE families, the major biological findings of this work are: (i) TSDs for TIR and LTR elements are less than 10 bp in length, (ii) TSD lengths for TIR and LTR elements are shared by related TE families in the same clade, (iii) TSMs for TIR and LTR elements are palindromes, and (iv) target sequence preferences for TIR and LTR element-encoded TSMs extend beyond the limits of the TSD
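
To make finding (iii) concrete: for double-stranded DNA, a "palindrome" usually means a sequence that equals its own reverse complement. The sketch below is purely illustrative and is not the method used in the study. It checks exact sequences only, whereas a target site motif is a position-specific pattern aggregated over many insertion sites, so a motif can be palindromic in a looser sense than this test captures. The function names are hypothetical.

```python
# Illustrative only: exact reverse-complement palindrome test.
# Assumption: sequences contain only the unambiguous bases A, C, G, T.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_dna_palindrome(seq: str) -> bool:
    """True if the sequence reads the same on both strands (5'->3')."""
    s = seq.upper()
    return s == reverse_complement(s)


if __name__ == "__main__":
    for candidate in ("GAATTC", "ATAT", "GATTC"):
        print(candidate, is_dna_palindrome(candidate))
```

Note that an odd-length sequence can never pass this exact test, because its central base would have to be its own complement.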

Summary

Introduction

Transposable elements (TEs) are mobile DNA sequences that can be found in virtually all organisms from prokaryotes to eukaryotes. TEs can be categorized into two major classes according to their method of transposition: (i) those that transpose directly into the host genome via a DNA molecule (transposons), and (ii) those that transpose through an RNA intermediate (retrotransposons) [2]. A characteristic mark of TE insertion in the genome is the presence of a target site duplication (TSD), which occurs upon TE integration as a result of staggered double-strand breaks at the target site [2]. Terminal inverted repeat (TIR) and long terminal repeat (LTR) elements insert into target sites as a DNA-protein complex that is thought to cause a fixed-length staggered cut characteristic of the TE family [2]. Transposition of non-LTR elements leaves a variable-length staggered cut in the genome, which leads to a variable distribution of TSD lengths for a given family [4].
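
The way a staggered cut gives rise to a TSD can be sketched with a toy, single-strand model. This is a simplified illustration under stated assumptions, not a description of any particular transposase or integrase: the function name, the parameters `cut_pos` and `tsd_len`, and the example sequences are all hypothetical. The bases spanned by the stagger are left single-stranded on each side of the inserted element and, once the gaps are filled in, appear as identical copies on both flanks.

```python
def insert_with_tsd(target: str, te: str, cut_pos: int, tsd_len: int) -> str:
    """Toy model of TE integration at a staggered cut.

    target  : target DNA (top strand only, an assumption of this sketch)
    te      : sequence of the inserted element
    cut_pos : 0-based start of the stagger on the top strand
    tsd_len : length of the stagger, which becomes the TSD length
    """
    if cut_pos < 0 or tsd_len < 0 or cut_pos + tsd_len > len(target):
        raise ValueError("staggered cut must lie within the target sequence")
    # Bases between the two offset nicks are duplicated after gap repair.
    tsd = target[cut_pos:cut_pos + tsd_len]
    return target[:cut_pos] + tsd + te + tsd + target[cut_pos + tsd_len:]


if __name__ == "__main__":
    # A 4 bp stagger at position 3 duplicates "CGTA" on both flanks of the TE.
    print(insert_with_tsd("AAACGTACGGG", "[TE]", cut_pos=3, tsd_len=4))
```

In this model a fixed `tsd_len` per family corresponds to the fixed-length cut described for TIR and LTR elements, while a `tsd_len` that varies between insertion events would correspond to the variable TSD lengths described for non-LTR elements.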
