Resolving Difficult Phylogenetic Questions: Why More Sequences Are Not Enough

Hervé Philippe, Henner Brinkmann, Gert Wörheide, Michael Manuel, Dennis V Lavrov, Denis Baurain, D Timothy J Littlewood

doi:10.1371/journal.pbio.1000602

Abstract

In the quest to reconstruct the Tree of Life, researchers have increasingly turned to phylogenomics, the inference of phylogenetic relationships using genome-scale data (Box 1). Mesmerized by the sustained increase in sequencing throughput, many phylogeneticists entertained the hope that the incongruence frequently observed in studies using single or a few genes [1] would come to an end with the generation of large multigene datasets. Yet, as so often happens, reality has turned out to be far more complex, as three recent large-scale analyses, one published in PLoS Biology [2]–[4], make clear. The studies, which deal with the early diversification of animals, produced highly incongruent (Box 2) findings despite the use of considerable sequence data (see Figure 1). Clearly, merely adding more sequences is not enough to resolve the inconsistencies.

Highlights

Taking these three studies as a case in point, we discuss pitfalls that the simple addition of sequences cannot avoid, and show how the observed incongruence can be largely overcome and how improved bioinformatics methods can help reveal the full potential of phylogenomics.
Non-phylogenetic signal can be reduced by improving (i) the quality of primary alignments, through selection of the orthologous genes that are least subject to saturation, and (ii) the detection of multiple substitutions, which is best achieved by using both a large number of species and the most realistic model of sequence evolution. We show that both improvements are required at the same time to address the difficult question of the relationships among major animal groups, i.e., sponges, placozoans, ctenophores, cnidarians, and bilaterians.
The topology we infer from the revised alignments is similar to the published tree [4], with only three nodes differing out of 21. This demonstrates that phylogenomics is relatively robust to the possible inclusion of non-orthologous sequences when the genuine phylogenetic signal is abundant. A likely explanation is that most of the introduced errors are random, which prevents a structured misleading signal from emerging.
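A statement such as "three nodes differing out of 21" rests on comparing the groupings implied by two topologies. The following is a minimal sketch of one way to make such a count, assuming rooted trees written as nested tuples of taxon names. The function names (`clades`, `differing_clades`) and the toy trees with taxa A to E are hypothetical illustrations, not the trees or the procedure of the studies discussed here.

```python
def clades(tree):
    """Collect the non-trivial clades of a rooted tree.

    The tree is a nested tuple of taxon names (an assumed toy format).
    Each clade is returned as a frozenset of the leaf names below a node.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):          # a leaf is just a taxon name
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)                 # record the clade under this internal node
        return members

    all_leaves = leaves(tree)
    found.discard(all_leaves)              # the root clade is shared by construction
    return found


def differing_clades(tree_a, tree_b):
    """Return (clades of tree_a absent from tree_b, total clades of tree_a)."""
    a, b = clades(tree_a), clades(tree_b)
    return len(a - b), len(a)


# Toy topologies (hypothetical): they disagree only on the placement of B and C.
tree_a = ((("A", "B"), "C"), ("D", "E"))
tree_b = ((("A", "C"), "B"), ("D", "E"))
print(differing_clades(tree_a, tree_b))
```

Published comparisons of unrooted trees usually count bipartitions (splits) rather than rooted clades, so this sketch only conveys the idea of node-by-node comparison.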

Summary

Hurdles to Phylogenomics

Two factors contribute significantly to the difficulty of reconstructing the correct phylogenetic tree for a set of sequences. Even if conflicting gene genealogies were not an issue, throwing additional gene sequences at a difficult phylogenetic question does not necessarily solve the problem: the size of the needle is increased, but so too is the size of the haystack. It follows that non-phylogenetic signal may become dominant and yield incongruent, yet statistically highly supported, phylogenomic trees [12]. Non-phylogenetic signal can be reduced by improving (i) the quality of primary alignments, through selection of the orthologous genes that are least subject to saturation, and (ii) the detection of multiple substitutions, which is best achieved by using both a large number of species and the most realistic model of sequence evolution. Reanalysis of the underlying data indicates that the incongruent, though strongly supported, results recently observed [2]–[4] were caused by a failure to apply one or more of these strategies for decreasing non-phylogenetic signal.
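To see why undetected multiple substitutions matter, compare the observed proportion of differing sites between two aligned sequences with a model-corrected distance. The sketch below uses the Jukes-Cantor (JC69) correction for nucleotides, d = -(3/4) ln(1 - 4p/3), purely as a simple illustration. It is an assumption made for this example and is far simpler than the models of sequence evolution discussed in this article. The function names are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Observed proportion of differing sites, ignoring positions with a gap ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jc69_distance(p):
    """Jukes-Cantor estimate of substitutions per site from an observed proportion p.

    As p approaches 0.75 the sequences are saturated: the observed differences
    no longer carry information about the true number of substitutions.
    """
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# The gap between observed and corrected distance widens as divergence grows.
for p in (0.05, 0.30, 0.60):
    print(f"observed {p:.2f} -> corrected {jc69_distance(p):.3f}")
```

At small divergences the two values nearly coincide. At large ones the observed proportion badly underestimates the number of substitutions, which is the sense in which saturated genes contribute non-phylogenetic signal.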

Issues at the Level of Sequence Alignments

Issues at the Level of Taxon Sampling

Issues at the Level of Tree Reconstruction Methods

Issues at the Level of Gene Sampling

Conclusion

Findings

Supporting Information

Journal: PLoS Biology
Publication Date: Mar 15, 2011
Citations: 1015
License type: CC BY 4.0

Similar Papers

Heterogeneous and dynamic marine shelf oxygenation and coupled early animal evolution.
Maoyan Zhu ... Chao Li
Emerging Topics in Life Sciences | VOL. 2
29 Jun 2018

Genomics of emerging infectious disease: A PLoS collection.
Jonathan A Eisen ... Catriona J Maccallum
PLoS Biology | VOL. 7
26 Oct 2009

Statistics and Truth in Phylogenomics
S Kumar ... A J Filipski
Molecular Biology and Evolution | VOL. 29
26 Aug 2011

The grand scheme of life: The Crucible of Creation: The Burgess Shale and the Rise of Animals by Simon Conway Morris
Gregory A Wray
Trends in Genetics | VOL. 15
01 Feb 1999
