Abstract

We explored the feasibility of using unaligned rRNA gene sequences as DNA barcodes, based on correlation analysis of composition vectors (CVs) derived from nucleotide strings. We tested this method with seven rRNA datasets (including 12S, 16S, 18S, 26S and 28S) from a wide variety of organisms (from archaea to tetrapods) at taxonomic levels ranging from class to species. Our results indicate that the grouping of taxa based on CV analysis is always in good agreement with the phylogenetic trees generated by traditional approaches, although in some cases the relationships among the higher systematic groups may differ. The effectiveness of our analysis might be related to the length of, and the divergence among, the sequences in a dataset. Nevertheless, the correct grouping of sequences and the accurate assignment of unknown taxa make our analysis a reliable and convenient approach for analyzing unaligned sequence datasets of various rRNAs for barcoding purposes. The newly designed software (CVTree 1.0) is publicly available at the Composition Vector Tree (CVTree) web server at http://cvtree.cbi.pku.edu.cn.
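The general idea of comparing unaligned sequences through composition vectors can be illustrated with a minimal Python sketch. It is not the CVTree 1.0 implementation: it assumes raw k-mer frequencies as the vector components, a cosine-type correlation between vectors, and a distance of (1 - C)/2, and it omits any background correction the published software may apply. The function names `composition_vector`, `cv_distance` and `assign_nearest`, and the default `k=3`, are hypothetical choices made for illustration only.

```python
from collections import Counter
from itertools import product
from math import sqrt


def composition_vector(seq, k=3):
    """Return normalized k-mer frequencies over the A/C/G/T alphabet.

    Assumption: raw frequencies only; k-mers containing other
    symbols (e.g. N) are ignored.
    """
    seq = seq.upper().replace("U", "T")  # treat RNA and DNA alike
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        raise ValueError("sequence too short or contains no valid k-mers")
    return [counts[m] / total for m in kmers]


def cv_distance(a, b):
    """Distance (1 - C) / 2, where C is the cosine correlation of a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return (1.0 - dot / norm) / 2.0


def assign_nearest(query, references, k=3):
    """Assign a query sequence to the label of the closest reference.

    `references` maps a label to a sequence; no alignment is performed.
    """
    q = composition_vector(query, k)
    return min(
        references,
        key=lambda label: cv_distance(q, composition_vector(references[label], k)),
    )
```

Under these assumptions, a pairwise matrix of `cv_distance` values could be passed to any standard distance-based tree-building method, and `assign_nearest` gives a rough analogue of assigning an unknown sequence to a reference group.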
