Abstract
Common bean (Phaseolus vulgaris) is an important legume crop grown and consumed worldwide. With the availability of the common bean genome sequence, the next challenge is to annotate the genome and characterize functional DNA elements. Transposable elements (TEs) are the most abundant component of plant genomes and can dramatically affect genome evolution and genetic variation. It is therefore essential to identify TEs in the common bean genome. In this study, we performed a genome-wide transposon annotation in common bean using a combination of homology-based and sequence structure-based methods. We developed a 2.12-Mb transposon database that includes 791 representative transposon sequences and is available upon request or from www.phytozome.org. Of note, nearly all transposons in the database are previously unrecognized TEs. More than 5,000 transposon-related expressed sequence tags (ESTs) were detected, which indicates that some transposons may be transcriptionally active. Two Ty1-copia retrotransposon families were found to encode an envelope-like protein, which has rarely been identified in plant genomes. We also identified an extra open reading frame (ORF), termed ORF2, in 15 Ty3-gypsy families; it was located between the ORF encoding the retrotransposase and the 3′ LTR, in the opposite transcriptional orientation to the retrotransposase. Sequence homology searches and phylogenetic analysis suggested that ORF2 may have an ancient origin, but its function is not clear. These transposon data provide a useful resource for understanding genome organization and evolution and may be used to identify active TEs for developing a transposon-tagging system in common bean and other related genomes.
Highlights
Large portions of all sequenced plant genomes consist of highly repetitive sequences, such as transposable elements (TEs) and tandem repeats, which play crucial roles in plant genome organization
A total of 12 long terminal repeat (LTR) retrotransposon families were identified from these BAC sequences
The boundaries, target site duplications (TSDs), and structures of all these sequences were manually inspected; as a result, 2,288 sequences were designated as LTR retrotransposons and another 8,061 sequences were discarded because they were tandem repeats, incomplete transposons, or other sequences
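As a quick check on the counts in the last highlight, the short Python sketch below tallies the manually inspected candidates and the share that was kept. It assumes that the 2,288 retained and 8,061 discarded sequences together make up the full set of inspected candidates, which the text implies but does not state outright. The variable names are illustrative only.

```python
# Counts quoted in the highlight above.
designated_ltr = 2288  # sequences kept as LTR retrotransposons
discarded = 8061       # tandem repeats, incomplete transposons, other sequences

# Assumption: these two groups together are every candidate that was inspected.
total_candidates = designated_ltr + discarded
retained_fraction = designated_ltr / total_candidates

print(f"candidates inspected: {total_candidates}")
print(f"retained as LTR retrotransposons: {retained_fraction:.1%}")
```

Under that assumption, roughly one in five candidates survived manual inspection.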
Summary
Large portions of all sequenced plant genomes consist of highly repetitive sequences, such as transposable elements (TEs) and tandem repeats, which play crucial roles in plant genome organization. In contrast to other repetitive sequences, TEs are mobile genetic elements that can move within a genome or, via horizontal transfer, between genomes (Roulin et al, 2008). TEs can impact genome structure and evolution. Centromeric retrotransposons (CRs) may be involved in the formation of functional centromeres (Jin et al, 2004). TEs serve as important components of heterochromatin, maintaining chromosome stability and heterochromatic silencing (Grewal and Jia, 2007). TEs provide raw material for evolutionary novelties, such as new gene functions, expression patterns, and networks (Cordaux and Batzer, 2009). TEs have been used as mutagens to isolate genes and characterize biological functions.