Abstract

Background: Plant LTR-retrotransposons are classified into two superfamilies, Ty1/copia and Ty3/gypsy. They are further divided into an enormous number of families which are, due to the high diversity of their nucleotide sequences, usually specific to a single species or a group of closely related species. Previous attempts to group these families into broader categories reflecting their phylogenetic relationships were limited either to analyzing a narrow range of plant species or to analyzing a small number of elements. Furthermore, there is no reference database that allows for similarity-based classification of LTR-retrotransposons.

Results: We have assembled a database of retrotransposon-encoded polyprotein domain sequences extracted from 5410 Ty1/copia elements and 8453 Ty3/gypsy elements sampled from 80 species representing major groups of green plants (Viridiplantae). Phylogenetic analysis of the three most conserved polyprotein domains (RT, RH and INT) led to dividing Ty1/copia and Ty3/gypsy retrotransposons into 16 and 14 lineages, respectively. We also characterized various features of LTR-retrotransposon sequences, including additional polyprotein domains, extra open reading frames and primer binding sites, and found that the occurrence and/or type of these features correlates with phylogenies inferred from the three protein domains.

Conclusions: We have established an improved classification system applicable to LTR-retrotransposons from a wide range of plant species. This system reflects phylogenetic relationships as well as distinct sequence and structural features of the elements. A comprehensive database of retrotransposon protein domains (REXdb) that reflects this classification provides a reference for efficient and unified annotation of LTR-retrotransposons in plant genomes. REXdb-related tools are available through the RepeatExplorer web server (https://repeatexplorer-elixir.cerit-sc.cz/) or through a standalone version of REXdb that can be downloaded separately from the RepeatExplorer web page (http://repeatexplorer.org/).

Highlights

  • Plant long terminal repeat (LTR) retrotransposons are classified into two superfamilies, Ty1/copia and Ty3/gypsy

  • Since the 5′ LTR and 3′ LTR are identical at the time a new element copy is inserted into the genome, their divergence, which is caused by mutations acquired over time, is proportional to the insertion age

  • To allow comparison of our data with sequences of previously described elements, additional LTR-retrotransposon nucleotide sequences were added from public databases [39, 40, 41] and from published studies [11, 24, 33]
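
The relationship between LTR divergence and insertion age noted above can be illustrated with a minimal sketch. It assumes the commonly used estimate T = K / (2r), where K is the divergence between the two LTRs of one element and r is a substitution rate per site per year. The Jukes-Cantor correction, the function names, and the rate used in the usage note are illustrative assumptions; none of them is taken from the source, which states only that divergence is proportional to insertion age.

```python
import math


def p_distance(ltr5: str, ltr3: str) -> float:
    """Proportion of differing sites between two pre-aligned LTRs.

    Columns containing a gap ('-') in either sequence are skipped.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("LTR sequences must be aligned to equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return mismatches / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance (an assumed substitution model)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p-distance must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_age(ltr5: str, ltr3: str, rate_per_site_per_year: float) -> float:
    """Estimated insertion age in years, T = K / (2r).

    The factor 2 reflects that both LTRs accumulate mutations
    independently after insertion. The rate is supplied by the caller
    because it is lineage-specific and not given in the source.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    k = jukes_cantor(p_distance(ltr5, ltr3))
    return k / (2.0 * rate_per_site_per_year)
```

For example, `insertion_age(five_prime_ltr, three_prime_ltr, 1e-8)` returns an age in years for two aligned LTR strings under a hypothetical rate of 1e-8 substitutions per site per year. Identical LTRs give an age of zero, consistent with a very recent insertion.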


Summary

Introduction

Plant LTR-retrotransposons are classified into two superfamilies, Ty1/copia and Ty3/gypsy. They are further divided into an enormous number of families which are, due to the high diversity of their nucleotide sequences, usually specific to a single species or a group of closely related species. Long terminal repeat (LTR) retrotransposons are a very large and diverse group of transposable elements that are ubiquitous in eukaryotes. They are abundant in plant genomes, where they make up as much as 75% of nuclear DNA [1]. Although LTR-retrotransposons are often viewed as genomic parasites, they may benefit their hosts by providing regulatory genetic elements [7], driving rapid genomic changes [8, 9], or forming an integral part of specific genome regions such as centromeres [10, 11]. Investigation of these processes is crucial to understanding genome evolution and function. These efforts are complicated by the absence of a general and applicable system of classification for these highly diverse elements.

