Abstract

Transposable elements (TEs) are ubiquitous components of eukaryotic genomes that impact many aspects of genome function. TE detection in genomic sequences is typically performed using similarity searches against a set of reference sequences built from previously identified TEs. Here, we demonstrate that this process can be improved by designing reference sets that incorporate key aspects of the structure and evolution of TEs and by combining these sets with Repbase Update (RU), which is composed mainly of consensus sequences. Using the Arabidopsis genome as a test case, our approach leads to the detection of an extra 12.4% of TE sequences. These correspond to novel TE fragments as well as to the extension of TE fragments already detected by RU. Significantly, we find that TE detection could be readily optimized using only two reference sets, one containing true consensus sequences and the other mosaic sequences that capture the structural diversity of TE copies within a family.
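As a rough illustration of how a gain such as "an extra 12.4% of TE sequences" might be computed, the sketch below merges TE annotations from a baseline reference set with those from an additional set and reports the extra bases covered. This is not the authors' pipeline. The function names, the half-open interval convention, and the choice to express the gain relative to baseline coverage are all assumptions made for illustration, and the coordinates are invented toy values.

```python
from typing import Iterable, List, Tuple

# Assumption: annotations are half-open [start, end) intervals on one sequence.
Interval = Tuple[int, int]


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def covered_bases(intervals: Iterable[Interval]) -> int:
    """Count bases covered by at least one interval."""
    return sum(end - start for start, end in merge_intervals(intervals))


def extra_coverage_percent(
    baseline: Iterable[Interval], additional: Iterable[Interval]
) -> float:
    """Extra bases gained by adding `additional`, as a percentage of baseline.

    Assumption: the gain is measured relative to baseline coverage; the
    abstract does not state the denominator it uses.
    """
    baseline = list(baseline)
    base = covered_bases(baseline)
    if base == 0:
        raise ValueError("baseline annotation covers no bases")
    combined = covered_bases(baseline + list(additional))
    return 100.0 * (combined - base) / base


# Toy data (hypothetical coordinates, not from the paper).
baseline_hits = [(100, 200), (500, 600)]    # e.g. hits from a consensus set
additional_hits = [(180, 230), (800, 820)]  # one extension, one novel fragment

gain = extra_coverage_percent(baseline_hits, additional_hits)
```

In this toy case the additional set contributes both kinds of gain the abstract mentions: the interval (180, 230) extends a fragment already detected, and (800, 820) is a fragment the baseline missed entirely.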
