Abstract

The gene encoding SSU rRNA is the tool of choice for phylogenetic and environmental biodiversity analyses of Bacteria and Archaea, as well as unicellular Eukaryota. In Eukaryota, the gene sequence is often interrupted by one or more introns, some of them long. Searching GenBank release 188, we found descriptions of 3638 such introns. Using a database of 180 000 SSU-rRNA sequences with well-annotated taxonomy and a C++ program written for that purpose, we detected 18 691 introns, including the 3638 already described. After filtering on length and sequence quality, 3646 sequences were retained. These introns were clustered, and each cluster was analyzed for the presence of a single clade or multiple clades at various levels of taxonomic depth, which will support future analyses of horizontal transfer. The results of these analyses are provided as tabulated files, together with FASTA files of the described and computed introns. Each intron sequence is annotated with its cellular location (nuclear, chloroplast, or mitochondrial), its position in the SSU-rRNA sequence, and its taxonomy as provided by GenBank.
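The single-versus-multiple clade check on intron clusters can be illustrated with a minimal sketch. It is not the authors' C++ program: the function names, the cluster assignments, and the lineages below are hypothetical, and it assumes each intron carries a GenBank-style lineage given as an ordered list of taxon names. A cluster counts as mixed at a given depth when its members' lineages, truncated to that depth, differ.

```python
from collections import defaultdict


def clades_per_cluster(cluster_of, lineage_of, depth):
    """Map each cluster id to the set of clades its members fall into.

    A clade is the member's lineage truncated to `depth` ranks
    (lineages shorter than `depth` are kept whole).
    """
    clades = defaultdict(set)
    for seq_id, cluster_id in cluster_of.items():
        clades[cluster_id].add(tuple(lineage_of[seq_id][:depth]))
    return dict(clades)


def mixed_clusters(cluster_of, lineage_of, depth):
    """Return the sorted ids of clusters spanning more than one clade."""
    per_cluster = clades_per_cluster(cluster_of, lineage_of, depth)
    return sorted(c for c, seen in per_cluster.items() if len(seen) > 1)


# Hypothetical toy data: intron id -> cluster id, intron id -> lineage.
cluster_of = {"intronA": 0, "intronB": 0, "intronC": 1, "intronD": 1}
lineage_of = {
    "intronA": ["Eukaryota", "Fungi", "Ascomycota"],
    "intronB": ["Eukaryota", "Fungi", "Basidiomycota"],
    "intronC": ["Eukaryota", "Viridiplantae", "Chlorophyta"],
    "intronD": ["Eukaryota", "Fungi", "Ascomycota"],
}

# Report which clusters mix clades at each taxonomic depth.
for depth in (1, 2, 3):
    print(depth, mixed_clusters(cluster_of, lineage_of, depth))
```

In this toy data, cluster 1 groups a fungal intron with a plant intron. Clusters that mix distant clades in this way are the kind of candidates a later horizontal-transfer analysis would examine.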
