Abstract

One particular class of Transposable Elements (TEs), the Long Terminal Repeat (LTR) retrotransposons, comprises the most abundant mobile elements in plant genomes. Their copy number can vary from several hundred to a few million copies per genome, deeply affecting genome organization and function. The detailed classification of LTR retrotransposons is an essential step toward understanding precisely how they affect the genome, but it remains challenging in large genomes and requires optimized bioinformatics tools that can take advantage of supercomputers. Here, we present Inpactor, a parallel and scalable pipeline designed to classify LTR retrotransposons, identify autonomous and non-autonomous elements, build RT-based phylogenetic trees, and analyze insertion times using High Performance Computing (HPC) techniques. Inpactor was tested on the classification and annotation of LTR retrotransposons in pineapple, whose genome was recently sequenced. The pineapple genome assembly consists of 44% transposable elements, of which 23% were classified as LTR retrotransposons. Remarkably, 16.4% of the pineapple genome assembly corresponded to a single lineage of the Gypsy superfamily, Del, suggesting that this lineage has undergone a significant increase in copy number. As demonstrated for the pineapple genome, Inpactor provides comprehensive data on the classification and dynamics of LTR retrotransposons, allowing a fine-grained understanding of their contribution to genome structure and evolution. Inpactor is available at https://github.com/simonorozcoarias/Inpactor.

Highlights

  • Transposable Elements (TEs) constitute the main part of the nuclear DNA content of plant genomes

  • TEs can be activated by a wide range of biotic and abiotic stresses [5,6], suggesting that they could play a significant role in the environmental adaptation of species [7]

  • We report the development of Inpactor, a parallel and scalable pipeline, able to classify Long Terminal Repeat (LTR) retrotransposons, to identify autonomous and non-autonomous elements, to perform

Introduction

Transposable Elements (TEs) constitute the main part of the nuclear DNA content of plant genomes. This is true for the large genomes of cereals such as wheat, barley and maize, for which up to 85% of the sequenced DNA is classified as repeated sequences [1]. The most common transposable elements in plant genomes are LTR retrotransposons, because they replicate by a “copy and paste” mechanism. They represent 75% of the maize genome [9], 67% of the wheat genome [1,10], 55% of the Sorghum bicolor genome [11] and 42% of the coffee genome [12].
