Abstract
BackgroundIt is generally admitted that the species tree cannot be inferred from the genetic sequences of a single gene because the evolution of different genes, and thus the gene tree topologies, may vary substantially. Gene trees can differ, for example, because of horizontal transfer events or because some of them correspond to paralogous instead of orthologous sequences. A variety of methods has been proposed to tackle the problem of the reconciliation of gene trees in order to reconstruct a species tree. When the taxa in all the trees are identical, the problem can be stated as a consensus tree problem.ResultsIn this paper we define a new method for deciding whether a unique consensus tree or multiple consensus trees can best represent a set of given phylogenetic trees. If the given trees are all congruent, they should be compatible into a single consensus tree. Otherwise, several consensus trees corresponding to divergent genetic patterns can be identified. We introduce a method optimizing the generalized score, over a set of tree partitions in order to decide whether the given set of gene trees is homogeneous or not.ConclusionsThe proposed method has been validated with simulated data (random trees organized in three topological groups) as well as with real data (bootstrap trees, homogeneous set of trees, and a set of non homogeneous gene trees of 30 E. Coli strains; it is worth noting that some of the latter genes underwent horizontal gene transfers). A computer program, MCT - Multiple Consensus Trees, written in C was made freely available for the research community (it can be downloaded from http://bioinformatics.lif.univ-mrs.fr/consensus/index.html). It handles trees in a standard Newick format, builds three hierarchies corresponding to RF and QS similarities between trees and the greedy ascending algorithm. The generalized score values of all tree partitions are computed.
Highlights
It is generally admitted that the species tree cannot be inferred from the genetic sequences of a single gene because the evolution of different genes, and the gene tree topologies, may vary substantially
The comparison of gene trees and their assembling in a unique tree representing the species tree is a fundamental problem in phylogeny
Several consensus strategies can be used [5]. We focus on those proceeding the same way : an X-tree is considered as a set of bipartitions, each one corresponding to an internal edge of the tree, the external ones connecting the leaves to the tree
Summary
It is generally admitted that the species tree cannot be inferred from the genetic sequences of a single gene because the evolution of different genes, and the gene tree topologies, may vary substantially. For example, because of horizontal transfer events or because some of them correspond to paralogous instead of orthologous sequences. A variety of methods has been proposed to tackle the problem of the reconciliation of gene trees in order to reconstruct a species tree. The comparison of gene trees and their assembling in a unique tree representing the species tree is a fundamental problem in phylogeny. A large variety of methods have been proposed to tackle the problem of the reconciliation of gene trees in order to reconstruct a species tree. According to an alignment of sequences from a set X of n taxa, each gene gives an unrooted X-tree [2]; X is the set of leaves, the internal nodes have degree 3 and the
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.