Coalescence methods have emerged as an alternative to concatenation methods for reconstructing species trees [1,2]. Zhong et al. [3] advocated the coalescence approach for resolving early branching events in plant phylogeny. We show that different coalescence methods yield discordant results and call attention to fundamental problems with the application of coalescence to deep phylogenetic questions such as the origin of land plants.

Streptophyte algae (Figure 1) are the closest living relatives of land plants, but elucidation of the lineage that represents the sister group to land plants remains controversial [3]. Zhong et al. [3] argued that concatenation analyses bearing on land plant origins are fundamentally flawed owing to conflicts among gene trees and suggested that coalescence methods that accommodate gene tree heterogeneity can overcome these problems. Zhong et al. [3] applied one of these methods, MP-EST [2], to six phylogenomic data sets (42 to 289 genes) and uniformly recovered Zygnematales as the sister group to land plants (Figure 1A), albeit with low bootstrap support (28–65%). The authors concluded that Zygnematales is the sister group of land plants based on consistent MP-EST support. However, uniform results should not be equated with accuracy and instead may reflect methodological biases [4].

A conspicuous feature of Zhong et al.’s [3] data set is missing sequences; for example, 119 genes in the preferred 184 gene data set were unsampled for the same 24 taxa including all three outgroups. This potentially debilitating circumstance required rooting >100 gene trees within the ingroup for these trees (Figure 1). Liu et al. (pp. 4–5 in [2]) cautioned that, for MP-EST, missing lineages ‘in some gene trees are allowed if lineages are missing randomly, but a lot of missing lineages may dramatically reduce the performance of the pseudo-likelihood approach.’ Zhong et al.
[3] ignored this advice and their data sets are plagued with massive blocks of nonrandomly distributed, missing sequences (Figure 1).

We performed coalescence analyses on the six phylogenomic data sets from Zhong et al. [3] with another method, STAR [1], and recovered Zygnematales + Coleochaetales as the sister group to land plants in five of six cases (Figure 1B). Given the manifest disagreement between MP-EST and STAR, these results undermine Zhong et al.’s [3] major conclusions regarding land plant origins and the utility of coalescence methods for deep phylogenetic problems.

MP-EST and STAR are both statistically consistent methods when their underlying assumptions are upheld and in these instances may yield more accurate species trees than concatenation [1,2]. However, theoretical guarantees are empty when assumptions are violated and should be trumped by empirical performance. Zhong et al. [3] extolled the virtues of coalescence methods based on simulations, but to our knowledge the only simulations where MP-EST or STAR outperformed concatenation involve a single tree that is small (5 taxa), asymmetrical, and shallow (3.1 coalescent units from root to tip) [1,2]. Simulations that have modeled ancient diversifications and larger sets of taxa have uniformly favored concatenation to ‘shortcut’ coalescence methods (STAR, MP-EST, STEM, STEAC, MDC) [5–8] even when data were simulated via coalescence models [6–8]. Patel et al. [7] found that STEM outperformed concatenation with true gene trees, but true gene trees are unknown outside simulations and instead gene trees must be inferred from sequences.
The poor performance of coalescence methods [5–8] presumably reflects their incorrect assumption that all conflict among gene trees is attributable to deep coalescence, whereas a multitude of other problems (long branches, mutational saturation, weak phylogenetic signal, model misspecification, poor taxon sampling) negatively impact reconstruction of accurate gene trees and provide more cogent explanations for incongruence [6,7].

Zhong et al.’s [3] gene trees imply unrealistic retentions of ancestral polymorphism that in one case exceed 500 million years [9] (Figure 1C). A consequence of highly inaccurate gene trees for coalescence analyses is a species tree with branch lengths (in coalescent units) that are impossibly short (Figure 1D). The interval that extends from the common ancestor of green plants to the common ancestor of land plants encompasses >500 million years (MY) (Figure 1). Yet, in Zhong et al.’s [3] MP-EST tree, this lineage spans only 3.9 coalescent units, which is equivalent to 0.4–19.2 MY for an annual plant with N
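The branch-length argument rests on converting coalescent units into years. The following is a minimal sketch of that arithmetic, not the authors' own calculation. It assumes the common diploid convention that one coalescent unit equals 2Ne generations and a generation time of one year for an annual plant. The function names are hypothetical, and the effective population sizes it prints are back-calculated from the 0.4–19.2 MY range quoted in the text, because the values the authors used are cut off in this copy.

```python
def coalescent_units_to_years(units, ne, generation_years=1.0, ploidy=2):
    """Convert a branch length in coalescent units to calendar years.

    Assumption (not from the source): one coalescent unit spans
    ploidy * Ne generations, i.e. 2Ne generations for a diploid.
    """
    return units * ploidy * ne * generation_years


def implied_ne(years, units, generation_years=1.0, ploidy=2):
    """Effective population size needed for `units` to span `years`."""
    return years / (units * ploidy * generation_years)


if __name__ == "__main__":
    branch_units = 3.9  # length of the lineage in the MP-EST tree, from the text

    # Ne implied by each end of the 0.4-19.2 MY range quoted in the text,
    # for an annual plant (generation time assumed to be 1 year).
    for years in (0.4e6, 19.2e6):
        ne = implied_ne(years, branch_units)
        print(f"{years / 1e6:5.1f} MY  ->  implied Ne of about {ne:,.0f}")

    # Ne that would be needed for 3.9 units to span the >500 MY interval.
    print(f"500 MY -> implied Ne of about {implied_ne(500e6, branch_units):,.0f}")
```

Under these assumptions, stretching 3.9 coalescent units across 500 MY would require an effective population size more than an order of magnitude larger than the upper end of the quoted range, which is the sense in which the estimated branch is impossibly short.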