Abstract

Background: Comparative DNA sequence analysis provides insight into evolution and helps construct a natural classification reflecting the Tree of Life. The growing numbers of organisms represented in DNA databases challenge tree-building techniques and the vertical hierarchical classification may obscure relationships among some groups. Approaches that can incorporate sequence data from large numbers of taxa and enable visualization of affinities across groups are desirable.

Methodology/Principal Findings: Toward this end, we developed a procedure for extracting diagnostic patterns in the form of indicator vectors from DNA sequences of taxonomic groups. In the present instance the indicator vectors were derived from mitochondrial cytochrome c oxidase I (COI) sequences of those groups and further analyzed on this basis. In the first example, indicator vectors for birds, fish, and butterflies were constructed from a training set of COI sequences, then correlations with test sequences not used to construct the indicator vector were determined. In all cases, correlation with the indicator vector correctly assigned test sequences to their proper group. In the second example, this approach was explored at the species level within the bird grouping; this also gave correct assignment, suggesting the possibility of automated procedures for classification at various taxonomic levels. A false-color matrix of vector correlations displayed affinities among species consistent with higher-order taxonomy.

Conclusions/Significance: The indicator vectors preserved DNA character information and provided quantitative measures of correlations among taxonomic groups. This method is scalable to the largest datasets envisioned in this field, provides a visually-intuitive display that captures relational affinities derived from sequence data across a diversity of life forms, and is potentially a useful complement to current tree-building techniques for studying evolutionary processes based on DNA sequence data.
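
The abstract does not spell out how an indicator vector is computed, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes each sequence is one-hot encoded, and it substitutes a simple stand-in for the indicator vector: the group's mean encoding minus the average over all groups, scaled to unit length. The function names (`one_hot`, `indicator_vectors`, `assign`, `affinity_matrix`) and the 8-base toy sequences are hypothetical and are not real COI data.

```python
import numpy as np

NUCLEOTIDES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a flat vector of length 4 * len(seq)."""
    vec = np.zeros((len(seq), 4))
    for i, base in enumerate(seq.upper()):
        j = NUCLEOTIDES.find(base)
        if j >= 0:  # ambiguous bases such as N are left as all zeros
            vec[i, j] = 1.0
    return vec.ravel()

def indicator_vectors(training):
    """Stand-in indicator vectors (an assumption, not the paper's method):
    group mean minus the mean of all group means, scaled to unit length."""
    means = {g: np.mean([one_hot(s) for s in seqs], axis=0)
             for g, seqs in training.items()}
    grand = np.mean(list(means.values()), axis=0)
    out = {}
    for g, m in means.items():
        d = m - grand
        out[g] = d / np.linalg.norm(d)
    return out

def assign(seq, indicators):
    """Return the group whose indicator vector correlates best with seq."""
    x = one_hot(seq)
    scores = {g: np.corrcoef(x, v)[0, 1] for g, v in indicators.items()}
    return max(scores, key=scores.get)

def affinity_matrix(indicators):
    """Pairwise correlations among indicator vectors (rows follow dict order)."""
    return np.corrcoef(np.vstack(list(indicators.values())))

# Toy aligned training sequences (hypothetical, 8 bases each).
training = {
    "birds": ["ACGTACGT", "ACGTACGA"],
    "fish": ["TTGTACCC", "TTGTACCA"],
    "butterflies": ["GGATCCGT", "GGATCCGA"],
}
indicators = indicator_vectors(training)

# Test sequences that were not part of the training set.
for test_seq in ["ACGTACGG", "TTGTACCG"]:
    print(test_seq, "->", assign(test_seq, indicators))
```

The matrix returned by `affinity_matrix` could be rendered as a false-color image (for example with a plotting library's heat-map function) to give a display in the spirit of the correlation matrix mentioned above. Real COI barcodes would first need to be aligned to a common length.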

Highlights

  • As Carl Woese first demonstrated over 30 years ago, the evolutionary history of organisms is embedded in their DNA [1]

  • The general approach to extracting phylogenetic information from DNA is the same as for morphologic analysis: arranging organisms in nested groups defined by synapomorphies, shared characters that represent a common evolutionary history [4]. (Here and in the following, "group" refers to a taxonomic group.)

  • We focus on the 648-nucleotide region of the cytochrome c oxidase subunit I (COI) gene, employed as a standard "DNA barcode" for distinguishing animal species [11], and use records in the Barcode of Life Database (BOLD), http://www.barcodinglife.org [12].


Summary

Introduction

As Carl Woese first demonstrated over 30 years ago, the evolutionary history of organisms is embedded in their DNA [1]. The patterning of ancient divergences that led to present-day forms can be reconstructed by comparing homologous sequences from different organisms, thereby establishing a natural classification in the form of a Tree of Life that reflects evolutionary history [2]. A tree diagram aims to express the temporal patterning of divergences and as such does not convey relative affinities among or within groups, such as might be due to positive or negative selection including convergent evolution. For these reasons, it is desirable to explore complements to tree-based methods for analyzing and displaying DNA sequences from large numbers of organisms, in particular approaches that can incorporate sequence data from many taxa and enable visualization of affinities across groups.

