Abstract

Background: Taxon sampling is a major concern in phylogenetic studies. Incomplete, biased, or improper taxon sampling can lead to misleading results in reconstructing evolutionary relationships. Several theoretical methods are available to optimize taxon choice in phylogenetic analyses. However, most require some knowledge of the genetic relationships within the group of interest (i.e., the ingroup), or even a well-established phylogeny itself; these data are not always available in general phylogenetic applications.

Results: We propose a new method to assess taxon sampling, building on Clarke and Warwick statistics. This method aims to measure the "phylogenetic representativeness" of a given sample or set of samples, and it is based entirely on the pre-existing taxonomy of the ingroup, which is commonly known to investigators. Moreover, our method also accounts for instability and discordance in taxonomies. A Python-based script suite, called PhyRe, has been developed to implement all analyses we describe in this paper.

Conclusions: We show that this method is sensitive and allows direct discrimination between representative and unrepresentative samples. It is also informative about which taxa to add to improve taxonomic coverage of the ingroup. Although investigators' expertise remains mandatory in this field, phylogenetic representativeness provides an objective touchstone for planning phylogenetic studies.
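To make the idea of a taxonomy-based measure concrete, the following is a minimal sketch of Clarke and Warwick's average taxonomic distinctness: the mean, over all pairs of sampled species, of the number of steps up the taxonomic hierarchy to the first rank the two species share. It is an illustration only and is not taken from PhyRe. The toy taxonomy, the function names, and the choice of equal, unscaled step lengths are all assumptions made for this example.

```python
from itertools import combinations

# Hypothetical toy taxonomy (not from the paper): each species maps to its
# lineage, listed from the lowest rank above species up to the highest rank.
TAXONOMY = {
    "sp_a": ("Genus1", "Family1", "Order1"),
    "sp_b": ("Genus1", "Family1", "Order1"),
    "sp_c": ("Genus2", "Family1", "Order1"),
    "sp_d": ("Genus3", "Family2", "Order1"),
}


def path_weight(lineage_i, lineage_j):
    """Steps up the hierarchy to the first rank shared by two species.

    Assumes equal step lengths (each rank counts as 1) and that both
    lineages list the same ranks in the same order.
    """
    for step, (taxon_i, taxon_j) in enumerate(zip(lineage_i, lineage_j), start=1):
        if taxon_i == taxon_j:
            return step
    raise ValueError("lineages do not share any rank")


def average_taxonomic_distinctness(sample, taxonomy):
    """Mean pairwise path weight over all species pairs in the sample."""
    pairs = list(combinations(sample, 2))
    if not pairs:
        raise ValueError("at least two species are needed")
    total = sum(path_weight(taxonomy[i], taxonomy[j]) for i, j in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    # Two congeneric species: the least distinct sample possible here.
    print(average_taxonomic_distinctness(["sp_a", "sp_b"], TAXONOMY))  # 1.0
    # One species per genus, spanning both families.
    print(round(average_taxonomic_distinctness(["sp_a", "sp_c", "sp_d"], TAXONOMY), 3))  # 2.667
    # The whole toy ingroup.
    print(round(average_taxonomic_distinctness(list(TAXONOMY), TAXONOMY), 3))  # 2.333
```

In this toy case, a sample drawn from a single genus scores far below the value for the whole ingroup, while a sample spread across genera and families scores at or above it. That contrast is the kind of signal a taxonomy-based assessment of sampling relies on.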

Highlights

  • Taxon sampling is a major concern in phylogenetic studies

  • The study of phylogenetics has a long tradition in evolutionary biology and countless statistical, mathematical, and bioinformatic approaches have been developed to deal with the increasing amount of available data

  • Two cases are trivial: when t = 1, S(t) equals S; when t = T, S(t) equals 1

Summary

Introduction

Taxon sampling is a major concern in phylogenetic studies. Incomplete, biased, or improper taxon sampling can lead to misleading results in reconstructing evolutionary relationships. Several theoretical methods are available to optimize taxon choice in phylogenetic analyses. The study of phylogenetics has a long tradition in evolutionary biology, and countless statistical, mathematical, and bioinformatic approaches have been developed to deal with the increasing amount of available data. A few species are chosen to represent a family or another high-level taxon. This issue is rarely formally addressed and is generally treated in a rather subjective way, yet it is one of the most frequent explanations given for incongruent phylogenetic results. Many theoretical approaches have been proposed to drive taxon sampling: see [7], and references therein, for a keystone review.

