On the artefactual parasitic eubacteria clan in conditioned logdet phylogenies: heterotachy and ortholog identification artefacts as explanations

Ajanthah Sangaralingam, Edward Susko, Matthew Spencer, David Bryant

doi:10.1186/1471-2148-10-343

Abstract

Background: Phylogenetic reconstruction methods based on gene content often place all the parasitic and endosymbiotic eubacteria (parasites for short) together in a clan. Many other lines of evidence point to this parasites clan being an artefact. This artefact could be a consequence of the methods used to construct ortholog databases (due to some unknown bias), the methods used to estimate the phylogeny, or both.

We test the idea that the parasites clan is an ortholog identification artefact by analyzing three different ortholog databases (COG, TRIBES, and OFAM), which were constructed using different methods, and are thus unlikely to share the same biases. In each case, we estimate a phylogeny using an improved version of the conditioned logdet distance method. If the parasites clan appears in trees from all three databases, it is unlikely to be an ortholog identification artefact.

Accelerated loss of a subset of gene families in parasites (a form of heterotachy) may contribute to the difficulty of estimating a phylogeny from gene content data. We test the idea that heterotachy is the underlying reason for the estimation of an artefactual parasites clan by applying two different mixture models (phylogenetic and non-phylogenetic), in combination with conditioned logdet. In these models, there are two categories of gene families, one of which has accelerated loss in parasites. Distances are estimated separately from each category by conditioned logdet. This should reduce the tendency for tree estimation methods to group the parasites together, if heterotachy is the underlying reason for estimation of the parasites clan.

Results: The parasites clan appears in conditioned logdet trees estimated from all three databases. This makes it less likely to be an artefact of database construction. The non-phylogenetic mixture model gives trees without a parasites clan. However, the phylogenetic mixture model still results in a tree with a parasites clan. Thus, it is not entirely clear whether heterotachy is the underlying reason for the estimation of a parasites clan. Simulation studies suggest that the phylogenetic mixture model approach may be unsuccessful because the model of gene family gain and loss it uses does not adequately describe the real data.

Conclusions: The most successful methods for estimating a reliable phylogenetic tree for parasitic and endosymbiotic eubacteria from gene content data are still ad-hoc approaches such as the SHOT distance method. However, the improved conditioned logdet method we developed here may be useful for non-parasites and can be accessed at http://www.liv.ac.uk/~cgrbios/cond_logdet.html
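
The abstract does not give the formula for conditioned logdet, so the following is only a minimal sketch of the general idea, not the authors' improved method. It assumes that each genome is a 0/1 presence/absence profile over gene families, that "conditioning" means keeping only families present in a chosen conditioning genome, and that the distance uses the paralinear normalisation of the 2x2 joint frequency matrix. The function names and the toy profiles are hypothetical.

```python
import math

def logdet_distance(x, y):
    """LogDet/paralinear-style distance between two 0/1 profiles.

    Assumption: this normalisation is one common choice; the source
    does not state which form the authors use.
    """
    n = len(x)
    if n == 0:
        return math.inf
    # counts[a][b] = number of families with state a in x and state b in y
    counts = [[0, 0], [0, 0]]
    for a, b in zip(x, y):
        counts[a][b] += 1
    f = [[counts[a][b] / n for b in (0, 1)] for a in (0, 1)]
    det_f = f[0][0] * f[1][1] - f[0][1] * f[1][0]
    # Marginal state frequencies in each genome
    px = [f[0][0] + f[0][1], f[1][0] + f[1][1]]
    py = [f[0][0] + f[1][0], f[0][1] + f[1][1]]
    norm = math.sqrt(px[0] * px[1] * py[0] * py[1])
    if det_f <= 0 or norm == 0:
        return math.inf  # distance undefined for this pair
    return -math.log(det_f / norm)

def conditioned_logdet_distance(x, y, conditioning):
    """Same distance, restricted to families present in a conditioning genome.

    Assumption: conditioning on presence in a reference genome is how the
    unobservable 'absent everywhere' families are handled in this sketch.
    """
    keep = [i for i, present in enumerate(conditioning) if present]
    return logdet_distance([x[i] for i in keep], [y[i] for i in keep])

# Hypothetical toy profiles over eight gene families
genome_a = [1, 1, 1, 0, 0, 1, 0, 1]
genome_b = [1, 1, 0, 0, 0, 1, 1, 1]
conditioning_genome = [1, 1, 1, 1, 1, 1, 1, 0]

d_ab = conditioned_logdet_distance(genome_a, genome_b, conditioning_genome)
```

In the two-category setting described above, a function like this would be applied to each category of gene families separately; how the per-category distances are then used is not stated in the abstract and is left out here.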

Highlights

Phylogenetic reconstruction methods based on gene content often place all the parasitic and endosymbiotic eubacteria together in a clan.
Phylogenetic reconstruction methods based on sequence data have difficulty in accounting for events such as genome fusion and horizontal gene transfer that occur during evolution [1].
A potential limitation is that, although the 16S tree which we used as a reference tree is a widely used standard tree for bacterial phylogenetics, it may itself have been affected by horizontal gene transfer, and it often conflicts with both trees for individual protein-coding genes [38] and trees based on concatenated alignments of many genes [18].

Summary

Introduction

Phylogenetic reconstruction methods based on gene content often place all the parasitic and endosymbiotic eubacteria (parasites for short) together in a clan. We test the idea that heterotachy is the underlying reason for the estimation of an artefactual parasites clan by applying two different mixture models (phylogenetic and non-phylogenetic), in combination with conditioned logdet. In these models, there are two categories of gene families, one of which has accelerated loss in parasites. SHOT (SHared Ortholog and gene order Tree reconstruction tool [4]) is a method for estimating gene content phylogeny which avoids some of the problems with genome size variation by ignoring shared absences. It is not based on any specific model of evolution. It is difficult to trust any results from SHOT that are not supported by methods with better statistical properties, and a consistent statistical method for estimating phylogenies from gene content data remains desirable.
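
To illustrate what "ignoring shared absences" means in practice, the sketch below compares a Jaccard-style distance, which counts only families present in at least one of the two genomes, with a simple mismatch fraction, which also counts shared absences. This is not SHOT's actual formula, which the text does not give; the function names and toy profiles are hypothetical.

```python
def shared_presence_distance(x, y):
    """Jaccard-style distance: shared absences do not enter the calculation."""
    shared = sum(1 for a, b in zip(x, y) if a and b)
    either = sum(1 for a, b in zip(x, y) if a or b)
    if either == 0:
        return 0.0  # neither genome has any family; treat as identical
    return 1.0 - shared / either

def mismatch_fraction(x, y):
    """Fraction of families whose state differs; shared absences count as matches."""
    return sum(1 for a, b in zip(x, y) if a != b) / len(x)

# Hypothetical toy profiles over eight gene families
genome_a = [1, 1, 1, 0, 0, 1, 0, 1]
genome_b = [1, 1, 0, 0, 0, 1, 1, 1]

# Pad both profiles with 100 families absent from both genomes, as would
# happen if many families found only in larger genomes were added.
padding = [0] * 100
padded_a = genome_a + padding
padded_b = genome_b + padding

jaccard_before = shared_presence_distance(genome_a, genome_b)
jaccard_after = shared_presence_distance(padded_a, padded_b)
mismatch_before = mismatch_fraction(genome_a, genome_b)
mismatch_after = mismatch_fraction(padded_a, padded_b)
```

The Jaccard-style value is unchanged by the padding, whereas the mismatch fraction shrinks, so two small genomes look more similar simply because they both lack many families. That is the genome-size effect that a measure ignoring shared absences sidesteps.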

Methods

Results

Conclusion

Journal: BMC Evolutionary Biology	Publication Date: Nov 9, 2010
Citations: 45	License type: cc-by

Similar Papers

When Do Phylogenetic Mixture Models Mimic Other Phylogenetic Models?
Elizabeth S Allman ... John A Rhodes
Systematic Biology | VOL. 61 | 10 Sep 2012

Phylogenetic mixture models for proteins
Si Quang Le ... Olivier Gascuel
Philosophical Transactions of the Royal Society B: Biological Sciences | VOL. 363 | 07 Oct 2008

A Phylogenetic Mixture Model for Gene Family Loss in Parasitic Bacteria
Matthew Spencer ... Ajanthah Sangaralingam
Molecular Biology and Evolution | VOL. 26 | 12 May 2009

Performance of Akaike Information Criterion and Bayesian Information Criterion in Selecting Partition Models and Mixture Models
Qin Liu ... Shane A Richards
Systematic Biology | VOL. 72 | 28 Dec 2022
