Abstract

Background: The genomic data now available have enabled the study of repetitive sequences and their relationship to viruses. Among them, long terminal repeat retrotransposons (LTR-RTs) are the largest component of most plant genomes, the Gypsy and Copia superfamilies being the most common. Recently it has been found that the Del lineage, an LTR-RT of the Gypsy superfamily, has putative virus-like attachment (vl-att) sites. This signature, originally described for retroviruses, is recognized by the retroviral integrase, conferring specificity to the integration process.

Results: Here we retrieved 26,092 putative complete LTR-RTs from 10 lineages found in 10 fully sequenced angiosperm genomes and found putative vl-att sites that are a conserved structural landmark across these genomes. Furthermore, we reveal that each plant genome has a distinguishable LTR-RT lineage amplification pattern that could be related to vl-att site diversity. We used these patterns to generate a specific quick-response (QR) code for each genome that could be used as a barcode for plant identification in the future.

Conclusions: The universal distribution of vl-att sites represents a new structural feature common to plant LTR-RTs and retroviruses. This is an important finding that expands the information about the structural similarity between LTR-RTs and retroviruses. We speculate that the sequence diversity of vl-att sites could be important for the life cycle of retrotransposons, as has been shown for retroviruses. All the structural vl-att site signatures are strong candidates for further functional studies. Moreover, this is the first identification of specific LTR-RT content and their amplification patterns in a large dataset of LTR-RT lineages and angiosperm genomes. These distribution patterns could be used in the future for biotechnological identification purposes.

Electronic supplementary material: The online version of this article (doi:10.1186/s13100-016-0069-5) contains supplementary material, which is available to authorized users.

Introduction

The genomic data now available have enabled the study of repetitive sequences and their relationship to viruses. Long terminal repeat retrotransposons (LTR-RTs) are the largest component of most plant genomes, the Gypsy and Copia superfamilies being the most common. Since the genome of Arabidopsis thaliana was sequenced in 2000, 55 other plant genomes have been released and published [1, 2]. This has advanced our understanding of genome composition, such as the discovery that repetitive sequences are major constituents of most genomes [3]. The predominant transposable element (TE) found in plant genomes is the LTR-RT, which represents ~79 % of the maize (~2.3 Gb total) and ~55 % of the sorghum (~730 Mb total) genomes [7,8,9,10,11]. Based on sequence similarities and on the structural/domain organization, LTR-RTs are divided into two major superfamilies: the Gypsy and the
