Abstract

Insertions and excisions of transposable elements (TEs) affect both the stability and variability of the genome. Studying the dynamics of transposition at the population level can provide crucial insights into the processes and mechanisms of genome evolution. Pooling genomic material from multiple individuals followed by high-throughput sequencing is an efficient way of characterizing genomic polymorphisms in a population. Here we describe a novel method named TEMP, specifically designed to detect TE movements present at a wide range of frequencies in a population. By combining the information provided by paired-end reads and split reads, TEMP is able to identify both the presence and absence of TE insertions in genomic DNA sequences derived from heterogeneous samples, accurately estimate the frequencies of transposition events in the population, and pinpoint junctions of high-frequency transposition events at nucleotide resolution. Simulation data indicate that TEMP outperforms other algorithms such as PoPoolationTE, RetroSeq, VariationHunter and GASVPro. TEMP also performs well on whole-genome human data derived from the 1000 Genomes Project. We applied TEMP to characterize the TE frequencies in a wild Drosophila melanogaster population and to study the inheritance patterns of TEs during hybrid dysgenesis. We also identified sequence signatures of TE insertion and possible molecular effects of TE movements, such as altered gene expression and piRNA production. TEMP is freely available on GitHub: https://github.com/JialiUMassWengLab/TEMP.git.

Highlights

  • Transposable element (TE) mobilization is one of the major sources of genomic variation and a potential driving force of evolution [1,2,3]

  • We demonstrated TEMP’s performance by comparing it with PoPoolationTE, RetroSeq, and two general-purpose structural variation discovery algorithms, VariationHunter and GASVPro, using simulated data

  • Other input files required by TEMP are transposon consensus sequences, which can be downloaded from Repbase (Version 17.07, http://www.girinst.org/repbase/), and RepeatMasker files containing the annotated TEs in the reference genome, which can be downloaded from the UCSC Genome Browser

Introduction

Transposable element (TE) mobilization is one of the major sources of genomic variation and a potential driving force of evolution [1,2,3]. Much progress has been made in discovering structural variations from high-throughput genomic DNA sequencing data [5,6,7]. As with any other type of genomic variation, it would be extremely useful to estimate the population frequency of polymorphic transposition events. Sequencing a large number of individuals in a population separately is impossible under many circumstances because of the prohibitively high costs and the difficulty in obtaining enough experimental material. Pooled sequencing is a widely employed experimental practice whereby investigators pool tissues from multiple individuals (or organisms) and sequence the DNA (or RNA) without knowing which read originates from which individual (or organism) [8,9,10,11]. When analyzed with an effective computational algorithm, this approach can accurately estimate the population frequency of transposition events.
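
To make the idea of a pooled frequency estimate concrete, the sketch below computes, for each candidate site, the fraction of reads covering the site that support the insertion. This is an illustrative assumption rather than TEMP's documented estimator: the function name, the site labels, and the read counts are all hypothetical, and a real pipeline would also need to handle mapping ambiguity and uneven coverage.

```python
def estimate_insertion_frequency(supporting_reads: int, reference_reads: int) -> float:
    """Fraction of reads covering a site that support the TE insertion.

    Assumption: each read is an independent draw from the pool, so the
    read fraction approximates the population frequency of the insertion.
    """
    if supporting_reads < 0 or reference_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = supporting_reads + reference_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return supporting_reads / total


# Hypothetical (insertion-supporting, reference-supporting) read counts
# for three candidate sites in one pooled sample.
sites = {
    "site_a": (30, 10),
    "site_b": (5, 95),
    "site_c": (12, 0),
}

frequencies = {
    name: estimate_insertion_frequency(supporting, reference)
    for name, (supporting, reference) in sites.items()
}

for name, freq in frequencies.items():
    print(f"{name}: {freq:.2f}")
```

With these made-up counts, a site supported by 30 of 40 reads is assigned a frequency of 0.75, while a site with no reference-supporting reads is treated as fixed in the pool.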
