An automated homology-based approach for identifying transposable elements

Ryan C Kennedy, Frank H Collins, Gregory R Madey, Maria F Unger, Scott Christley

doi:10.1186/1471-2105-12-130

Open Access

https://doi.org/10.1186/1471-2105-12-130

Abstract

Background

Transposable elements (TEs) are mobile sequences found in nearly all eukaryotic genomes. They can move and replicate within a genome, often influencing genome evolution and gene expression. The identification of TEs is an important part of every genome project. The number of sequenced genomes is rapidly rising, and the need to identify TEs within them is also growing. The ability to do this automatically and effectively, in a manner similar to the methods used for genes, is of increasing importance. Identifying TEs is difficult for several reasons, including their tendency to degrade over time and the fact that many do not adhere to a conserved structure. In this work, we describe a homology-based approach for the automatic identification of high-quality consensus TEs, intended for use in the analysis of newly sequenced genomes.

Results

We describe a homology-based approach for the automatic identification of TEs in genomes. Our modular approach is dependent on a thorough and high-quality library of representative TEs. The implementation of the approach, named TESeeker, is BLAST-based, but also makes use of the CAP3 assembly program and the ClustalW2 multiple sequence alignment tool, as well as numerous BioPerl scripts. We apply our approach to newly sequenced genomes and successfully identify consensus TEs that are up to 99% identical to manually annotated TEs.

Conclusions

While TEs are known to be a major force in the evolution of genomes, the automatic identification of TEs in genomes is far from mature. In particular, there is a lack of automated homology-based approaches that produce high-quality TEs. Our approach is able to generate high-quality consensus TE sequences automatically, requiring the user to provide only a few basic parameters. This approach is intentionally modular, allowing researchers to use components separately or iteratively. Our approach is most effective for TEs with intact reading frames. The implementation, TESeeker, is available for download as a virtual appliance, while the library of representative TEs is available as a separate download.

Highlights

Transposable elements (TEs) are mobile sequences found in most eukaryotic genomes
For Class I elements, the library consists of 227 long terminal repeat (LTR) amino acid sequences representing the cer1, copia, csrn1, Cyclops, gypsy, mag, mdg1, mdg3, osvaldo, Pao/Bel, and Ty3 families, as well as 49 non-LTR amino acid sequences representing the CR1, I, Jockey, L1, L2, LOA, Loner, Outcast, R1, R4, and RTE families
Identify Complete TE: To validate and improve the consensus sequence, we look for similar copies of it in the genome with a blastn search
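
The blastn step in the last highlight can be illustrated with a small post-processing sketch. It is not TESeeker's code (the paper describes BioPerl scripts); it only assumes that blastn was run with the standard 12-column tabular output (`-outfmt 6`) and shows one plausible way to keep hits that look like near-complete copies of the consensus. The names `Hit`, `parse_hits`, and `filter_copies`, the sample rows, and the identity and coverage thresholds are all hypothetical.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    subject: str   # genome sequence that was hit
    pident: float  # percent identity reported by BLAST
    length: int    # alignment length (includes gap columns)
    start: int     # leftmost subject coordinate
    end: int       # rightmost subject coordinate
    strand: str    # "+" or "-", inferred from coordinate order


def parse_hits(tabular: str) -> List[Hit]:
    """Parse standard 12-column BLAST tabular output (-outfmt 6)."""
    hits = []
    for line in tabular.strip().splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        sstart, send = int(fields[8]), int(fields[9])
        # blastn reports minus-strand hits with sstart > send.
        strand = "+" if sstart <= send else "-"
        hits.append(Hit(fields[1], float(fields[2]), int(fields[3]),
                        min(sstart, send), max(sstart, send), strand))
    return hits


def filter_copies(hits: List[Hit], query_len: int,
                  min_identity: float = 90.0,
                  min_coverage: float = 0.8) -> List[Hit]:
    """Keep hits that are similar enough and span most of the query.

    Coverage is approximated as alignment length / query length;
    both thresholds are illustrative assumptions, not values from the paper.
    """
    return [h for h in hits
            if h.pident >= min_identity
            and h.length / query_len >= min_coverage]


# Made-up hits for a hypothetical 1000 bp consensus sequence.
SAMPLE = "\n".join([
    "consensus\tchr1\t98.50\t950\t14\t0\t1\t950\t1000\t1949\t0.0\t1700",
    "consensus\tchr2\t85.00\t900\t135\t0\t1\t900\t5000\t5899\t0.0\t1000",
    "consensus\tchr3\t99.00\t300\t3\t0\t1\t300\t200\t499\t1e-150\t540",
    "consensus\tchr4\t95.00\t980\t49\t0\t1\t980\t8980\t8001\t0.0\t1500",
])

if __name__ == "__main__":
    for hit in filter_copies(parse_hits(SAMPLE), query_len=1000):
        print(hit.subject, hit.strand, hit.start, hit.end, hit.pident)
```

In this made-up sample, the chr2 hit is dropped for low identity and the chr3 hit for covering only part of the consensus, leaving the chr1 hit and the minus-strand chr4 hit as candidate copies.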

Summary

Results

We describe a homology-based approach for the automatic identification of TEs in genomes. Our modular approach is dependent on a thorough and high-quality library of representative TEs. The implementation of the approach, named TESeeker, is BLAST-based, and makes use of the CAP3 assembly program and the ClustalW2 multiple sequence alignment tool, as well as numerous BioPerl scripts. We apply our approach to newly sequenced genomes and successfully identify consensus TEs that are up to 99% identical to manually annotated TEs.
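
To make a figure such as "99% identical" concrete, the following is a minimal sketch of one common way to compute percent identity between two sequences that have already been aligned (for example, by a pairwise or multiple alignment tool). The text here does not state which definition of identity was used, so the convention below (matches divided by the number of alignment columns, excluding columns where both sequences have a gap) is an assumption, and `percent_identity` is a hypothetical helper.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two aligned, equal-length sequences.

    Gaps are written as '-'. A column with a gap in exactly one
    sequence counts as a mismatch; columns with gaps in both are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # shared gap column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: one gap column and one substitution in ten columns.
    print(percent_identity("ACGT-ACGTA", "ACGTTACGAA"))
```

Other conventions (for example, dividing by the length of the shorter sequence or ignoring all gap columns) give different values for the same alignment, so reported identities are only comparable when the definition is the same.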

Journal: BMC Bioinformatics	Publication Date: May 3, 2011
Citations: 66	License type: CC BY 2.0

Similar Papers

Bioinformatics and genomic analysis of transposable elements in eukaryotic genomes
Mateusz Janicki ... Rebecca Rooke
Chromosome Research | VOL. 19
01 Aug 2011

Characterization and functional annotation of nested transposable elements in eukaryotic genomes
Caihua Gao ... Jiana Li
Genomics | VOL. 100
15 Jul 2012

Computational approaches for identification and classification of transposable elements in eukaryotic genomes
Hong-En Xu ... Zhong-Huai Xiang
Hereditas (Beijing) | VOL. 34
28 Aug 2012

Distribution patterns and impact of transposable elements in genes of green algae
Gisele S Philippsen ... Ricardo Demarco
Gene | VOL. 594
07 Sep 2016