Abstract

Background: Analyzed individually, gene trees for a given taxon set tend to harbour incongruent or conflicting signals. One popular approach to dealing with this is to use concatenated data. But especially in prokaryotes, where lateral gene transfer (LGT) is a natural mechanism of generating genetic diversity, there are open questions as to whether concatenation amplifies or averages phylogenetic signals residing in individual genes. Here we examine concatenations of prokaryotic and eukaryotic datasets to investigate possible sources of incongruence in phylogenetic trees and to assess the level of overlap between individual and concatenated alignments.

Results: We analyzed prokaryotic datasets comprising 248 individual gene trees from 315 genomes at three taxonomic depths spanning gammaproteobacteria, proteobacteria, and prokaryotes (bacteria plus archaea), and eukaryotic datasets comprising 279 individual gene trees from 85 genomes at two taxonomic depths: across plants-animals-fungi and within fungi. Consistent with previous findings, the branches in trees made from concatenated alignments are, in general, not supported by any of their underlying individual gene trees, even though the concatenation trees tend to possess high bootstrap proportions. For the prokaryote data, this observation is independent of phylogenetic depth and sequence conservation. The eukaryotic data show much better agreement between concatenation and single gene trees. LGT frequencies in trees were estimated using established methods. Sequence length in individual alignments, but not sequence divergence, was found to correlate with the generation of branches that correspond to the concatenated tree.

Conclusions: The weak correspondence of concatenation trees with single gene trees raises the question of where the phylogenetic signal in concatenated trees is coming from. The eukaryote data reveal a better correspondence between individual and concatenation trees than the prokaryote data. The question of whether the lack of correspondence between individual genes and the concatenation tree in the prokaryotic data is due to LGT or phylogenetic artefacts remains unanswered. If LGT is the cause of incongruence between concatenation and individual trees, we would have expected to see greater degrees of incongruence for more divergent prokaryotic data sets, which was not observed, although estimated rates of LGT suggest that LGT is responsible for at least some of the observed incongruence.

Electronic supplementary material: The online version of this article (doi:10.1186/s12862-014-0266-0) contains supplementary material, which is available to authorized users.

Highlights

  • Analyzed individually, gene trees for a given taxon set tend to harbour incongruent or conflicting signals

  • In each 100-genome, 48-gene sample, the frequency of branches in the 48 individual gene trees was compared to the set of branches in the concatenation tree
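
The branch comparison described in the last highlight can be illustrated with a minimal sketch. It is not the authors' pipeline: the nested-tuple tree representation, the function names `leaves`, `splits` and `branch_support`, and the five-taxon toy trees are all assumptions made for illustration, and every tree is assumed to share the same taxon set. Each internal branch is treated as a bipartition (split) of the taxa, and for every split in the concatenation tree the sketch reports the fraction of gene trees that contain the same split.

```python
from typing import Dict, FrozenSet, List, Set, Tuple, Union

# Hypothetical representation: a leaf is a taxon name, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Collect the taxon names found at the tips of a tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree: Tree) -> Set[FrozenSet[str]]:
    """Return the non-trivial bipartitions of a tree.

    Each split is stored as the side that excludes a fixed reference taxon,
    so the result does not depend on where the tree happens to be rooted.
    """
    all_taxa = leaves(tree)
    reference = min(all_taxa)
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_taxa - clade if reference in clade else clade
        # Keep only splits with at least two taxa on each side.
        if 2 <= len(side) <= len(all_taxa) - 2:
            found.add(side)
        return clade

    visit(tree)
    return found


def branch_support(concat_tree: Tree, gene_trees: List[Tree]) -> Dict[FrozenSet[str], float]:
    """For each split of the concatenation tree, the fraction of gene trees containing it."""
    gene_splits = [splits(t) for t in gene_trees]
    return {
        s: sum(s in gs for gs in gene_splits) / len(gene_splits)
        for s in splits(concat_tree)
    }


# Toy data (invented for illustration only).
concat = ((("A", "B"), "C"), ("D", "E"))
genes = [
    ((("A", "B"), "C"), ("D", "E")),
    ((("A", "C"), "B"), ("D", "E")),
    ((("A", "C"), "D"), ("B", "E")),
]
support = branch_support(concat, genes)
for split, fraction in sorted(support.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(split), round(fraction, 2))
```

In this toy case the split separating D and E from the remaining taxa is found in two of the three gene trees, while the split separating A and B from the rest is found in only one. Real analyses would read trees from files and would need to handle gene trees with missing taxa, which this sketch does not attempt.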

Summary

Introduction

Gene trees for a given taxon set tend to harbour incongruent or conflicting signals. Williams and Embley [8] reinspected those data and found that the sequence collection procedure used by Rinke et al. [7] had included several nuclear genes of mitochondrial and plastid origin among the eukaryotic sequences; when those were removed and replaced by eukaryotic nuclear genes that had not been acquired from mitochondria or plastids, the two-domain tree was obtained [8], in which eukaryotes branch within the archaea [9]. Another source of conflict is phylogenetic error due to unknown factors that are often subsumed into the term model misspecification. The differing results are best explained by the circumstance that different proteins undergo amino acid substitution in different ways over evolutionary time, and according to different processes, models for which can be approximated mathematically [10,11].
