Abstract
Background
Eragrostis tef is an allotetraploid (2n = 4x = 40) annual, C4 grass with an estimated nuclear genome size of 730 Mbp. It is widely grown in Ethiopia, where it provides basic nutrition for more than half of the population.
Although a draft assembly of the E. tef genome was made available in 2014, characterization of the repetitive portion of the E. tef genome has not been a subject of a detailed analysis.
Repetitive sequences constitute most of the DNA in eukaryotic genomes. Transposable elements are usually the most abundant repetitive component in plant genomes. They contribute to genome size variation, cause mutations, can result in chromosomal rearrangements, and influence gene regulation. An extensive and in depth characterization of the repetitive component is essential in understanding the evolution and function of the genome.
Results
Using new paired-end sequence data and a de novo repeat identification strategy, we identified the most repetitive elements in the E. tef genome. Putative repeat sequences were annotated based on similarity to known repeat groups in other grasses.
Altogether we identified 1,389 medium/highly repetitive sequences that collectively represent about 27 % of the teff genome. Phylogenetic analyses of the most important classes of TEs were carried out in a comparative framework including paralog elements from rice and maize. Finally, an abundant tandem repeat accounting for more than 4 % of the whole genome was identified and partially characterized.
Conclusions
Analyzing a large sample of randomly sheared reads we obtained a library of the repetitive sequences of E. tef. The approach we used was designed to avoid underestimation of repeat contribution; such underestimation is characteristic of whole genome assembly projects. The data collected represent a valuable resource for further analysis of the genome of this important orphan crop.
Electronic supplementary material
The online version of this article (doi:10.1186/s12870-016-0725-4) contains supplementary material, which is available to authorized users.
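As a rough illustration of the scale implied by the figures in the abstract, the sketch below converts the reported genome fractions into approximate amounts of sequence. It assumes the 730 Mbp genome size estimate and the "about 27 %" and "more than 4 %" fractions quoted above. The function name `fraction_to_mbp` is hypothetical, and the resulting values are back-of-the-envelope estimates, not results reported by the authors.

```python
# Back-of-the-envelope conversion of genome percentages to Mbp.
# Assumption: the 730 Mbp nuclear genome size estimate quoted in the abstract.
GENOME_SIZE_MBP = 730.0


def fraction_to_mbp(percent, genome_size_mbp=GENOME_SIZE_MBP):
    """Return the approximate amount of sequence (Mbp) for a genome percentage.

    Hypothetical helper for illustration only.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return genome_size_mbp * percent / 100.0


# Medium/highly repetitive sequences: about 27 % of the genome.
repeats_mbp = fraction_to_mbp(27)
# Abundant tandem repeat: more than 4 %, so this is a lower bound.
tandem_mbp = fraction_to_mbp(4)

print(f"Repetitive library: roughly {repeats_mbp:.1f} Mbp")
print(f"Tandem repeat: at least {tandem_mbp:.1f} Mbp")
```

Under these assumptions, the identified repeats would correspond to roughly 197 Mbp of sequence and the tandem repeat alone to at least 29 Mbp.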
Highlights
Transposable elements are usually the most abundant repetitive component in plant genomes
Eragrostis tef is an allotetraploid (2n = 4x = 40) annual, C4 grass with an estimated nuclear genome size of 730 Mbp
The most represented class of transposable elements (TEs) in the repetitive library was Long Terminal Repeat Retroelements (LTR-reverse transcriptase (RT)), accounting for 31.82 % of the entries
Summary
Transposable elements are usually the most abundant repetitive component in plant genomes. They contribute to genome size variation, cause mutations, can result in chromosomal rearrangements, and influence gene regulation. This variation in genome size does not correlate with the biological complexity of the organisms; gene content remains quite similar across different species. This phenomenon has been described as the “C-value paradox”, where the 1C DNA value is the quantity of DNA in a gamete [1]. Genome size variation is extremely evident in plants, spanning at least three orders of magnitude between the 1C DNA content of the genome of Genlisea margaretae (58.68 Mb) [2] and the [...]
Repetitive sequences include tandem-arranged satellite sequences, telomeric sequences, microsatellite sequences, ribosomal genes, and transposable elements (TEs) [5]. Depending on the mechanism adopted during transposition and/or on the molecule used as an intermediate, they are hierarchically [...]