Accuracy of ancestral state reconstruction for non-neutral traits

Abstract

The assumptions underpinning ancestral state reconstruction are violated in many evolutionary systems, especially for traits under directional selection. However, the accuracy of ancestral state reconstruction for non-neutral traits is poorly understood. To investigate the accuracy of ancestral state reconstruction methods, trees and binary characters were simulated under the BiSSE (Binary State Speciation and Extinction) model using a wide range of character-state-dependent rates of speciation, extinction and character-state transition. We used maximum parsimony (MP), BiSSE and two-state Markov (Mk2) models to reconstruct ancestral states. Under each method, error rates increased with node depth, true number of state transitions, and rates of state transition and extinction; exceeding 30% for the deepest 10% of nodes and highest rates of extinction and character-state transition. Where rates of character-state transition were asymmetrical, error rates were greater when the rate away from the ancestral state was largest. Preferential extinction of species with the ancestral character state also led to higher error rates. BiSSE outperformed Mk2 in all scenarios where either speciation or extinction was state dependent and outperformed MP under most conditions. MP outperformed Mk2 in most scenarios except when the rates of character-state transition and/or extinction were highly asymmetrical and the ancestral state was unfavoured.
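One way to see why error rates rise with node depth and with transition rate is to look at the transition probabilities of a two-state Markov (Mk2) model. The sketch below is not taken from the study; it uses the standard closed-form solution for a two-state continuous-time Markov chain, and the names `mk2_transition_probs`, `q01`, and `q10` are illustrative assumptions.

```python
import math


def mk2_transition_probs(q01, q10, t):
    """Return the 2x2 matrix P(t) for a two-state Markov (Mk2) model.

    q01 is the rate of change 0 -> 1, q10 the rate 1 -> 0, and t the
    branch length in the same time units as the rates.
    P[i][j] is the probability of ending in state j given a start in i.
    """
    total = q01 + q10
    if total <= 0:
        raise ValueError("at least one rate must be positive")
    decay = math.exp(-total * t)
    p01 = (q01 / total) * (1.0 - decay)
    p10 = (q10 / total) * (1.0 - decay)
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


def ancestral_signal(q01, q10, t):
    """Difference P(end in 0 | start 0) - P(end in 0 | start 1).

    This equals exp(-(q01 + q10) * t): it shrinks toward zero as the
    rates or the elapsed time grow, so tips carry less and less
    information about the state at a deep node.
    """
    p = mk2_transition_probs(q01, q10, t)
    return p[0][0] - p[1][0]
```

Under these assumptions, `ancestral_signal` decays exponentially in both the total rate and the elapsed time, which is consistent with the reported pattern of higher error at deeper nodes and under higher transition rates. State-dependent speciation and extinction, as modelled by BiSSE, are not captured by this sketch.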

Similar Papers
  • Research Article
  • Cite count: 27
  • 10.4137/ebo.s39732
Notes on the Statistical Power of the Binary State Speciation and Extinction (BiSSE) Model
  • Jan 1, 2016
  • Evolutionary Bioinformatics Online
  • Alexander Gamisch

The Binary State Speciation and Extinction (BiSSE) method is one of the most popular tools for investigating the rates of diversification and character evolution. Yet, based on previous simulation studies, it is commonly held that the BiSSE method requires phylogenetic trees of fairly large sample sizes (>300 taxa) in order to distinguish between the different models of speciation, extinction, or transition rate asymmetry. Here, the power of the BiSSE method is reevaluated by simulating trees of both small and large sample sizes (30, 60, 90, and 300 taxa) under various asymmetry models and root state assumptions. Results show that the power of the BiSSE method can be much higher, also in trees of small sample size, for detecting differences in speciation rate asymmetry than anticipated earlier. This, however, is not a consequence of any conceptual or mathematical flaw in the method per se but rather of assumptions about the character state at the root of the simulated trees and thus the underlying macroevolutionary model, which led to biased results and conclusions in earlier power assessments. As such, these earlier simulation studies used to determine the power of BiSSE were not incorrect but biased, leading to an overestimation of type-II statistical error for detecting differences in speciation rate but not for extinction and transition rates.

  • Research Article
  • Cite count: 11
  • 10.1186/s12862-018-1174-5
Binary-state speciation and extinction method is conditionally robust to realistic violations of its assumptions
  • May 8, 2018
  • BMC Evolutionary Biology
  • Andrew G Simpson + 3 more

Background: Phylogenetic comparative methods allow us to test evolutionary hypotheses without the benefit of an extensive fossil record. These methods, however, make simplifying assumptions, among them that clades are always increasing or stable in diversity, an assumption we know to be false. This study simulates hypothetical clades to test whether the Binary State Speciation and Extinction (BiSSE) method can be used to correctly detect relative differences in diversification rate between ancestral and derived character states even as net diversification rates are declining overall. We simulate clades with declining but positive diversification rates, as well as those in which speciation rates decline below extinction rates so that they are losing richness for part of their history. We run these analyses both with simulated symmetric and asymmetric speciation rates to test whether BiSSE can be used to detect them correctly. Results: For simulations with a neutral character, the fit for a BiSSE model with a neutral character is better than alternative models so long as net diversification rates remain positive. Once net diversification rates become negative, the BiSSE model with the greatest likelihood often has a non-neutral character, even though there is no such character in the simulation. BiSSE’s usefulness in detecting real asymmetry in speciation rates improves with clade age, even well after net diversification rates have become negative. Conclusions: BiSSE is most useful in analyzing clades of intermediate age, before they have reached peak diversity and gone into decline. After this point, users of BiSSE risk incorrectly inferring differential evolutionary rates when none exist. Fortunately, most studies using BiSSE and similar models focus on rapid, recent diversifications, and are less likely to encounter the biases BiSSE models are subject to for older clades. 
For extant groups that were once more diverse than now, however, caution should be taken in inferring past diversification patterns without fossil data.

  • Discussion
  • Cite count: 17
  • 10.1002/ajb2.1278
Diversity and skepticism are vital for comparative biology: a response to Donoghue and Edwards (2019).
  • May 1, 2019
  • American Journal of Botany
  • Jeremy M Beaulieu + 1 more

Several objections raised by Donoghue and Edwards (2019) regarding a recent paper of ours (Beaulieu and O'Meara, 2018) provide a good opportunity to discuss the different uses of phylogenies to understand evolutionary history. Are small-scale studies more informative and prone to less error? Or, as we argue, do studies at different scales carry their own benefits and costs? Is ascertainment bias, a term we use to describe how individual clades chosen for study may represent a biased sample of life, a potentially critical problem for macroevolutionary studies? Donoghue and Edwards (2019) recently argued that the ascertainment bias we highlighted is not as problematic as we claimed. In fact, they pointed to our own simulations as evidence, which showed that despite the increased variance, analyses of many smaller, variable subclades produced, on average, reasonable estimated rates of the larger and more inclusive clade. We disagree with their interpretation and expand on the ways ascertainment bias may affect our conclusions about the mode and tempo of evolution, and we also highlight areas where more research is needed. Finally, there is a disagreement over whether it is more informative to understand parameters of a process (e.g., rate of trait evolution) or inferred details of a particular history (e.g., ancestral states). We address these issues in reverse order, as each builds toward the next. A key difference between Donoghue and Edwards (2019) and our view centers on the value of parameter estimation. For example, they are critical that the estimation of a particular rate of, say 42%, is informative (p. 329) and that a "global rate" should not be a goal of comparative biology. These comments reflect two important and undervalued issues that are common in the field: (1) parameters in comparative models typically have units and (2) interpretation of these parameters in the context of the underlying model. 
First, we agree with Donoghue and Edwards (2019) in that a "rate of 42%" has very little, if any, meaning. A percentage is, in fact, not a rate at all. As we (e.g., Beaulieu and O'Meara, 2015) and others (e.g., Hansen, 1997) have argued, rates have units (e.g., state transitions per million years, speciation events per million years, accumulated variance per million years) and thus should be discussed in those terms. We acknowledge, however, that units are often glossed over in simulations because the scale of the tree may also lack units (e.g., branch lengths in millions of years). In any event, knowing whether a particular trait (e.g., selfing) has evolved 42 times per million years of evolutionary history, or one hundredth of that, suggests what has driven the persistence of the different character states. For example, transition rates among a coordinated set of floral traits were low enough that it likely took tens of millions of years for flowers to evolve a diversity-accelerating combination of bilateral symmetry, few stamens, and showy petals (O'Meara, Smith et al., 2016). And, as a result, angiosperms as a whole are not yet at equilibrium with respect to the diversity of their floral trait combinations. Such a discovery comes from understanding transition rates with units, as well as looking at a wide diversity of angiosperms. Rates can have more intuitive appeal when they are transformed, such as taking the reciprocal to express a rate in units of the expected wait time. A rate of selfing evolution of 42/Myr means a single lineage is expected to transition to selfing once every 0.024 Myr, or every 24,000 years, which is intuitively a very high rate. The same can be done with diversification rates to give expected times between speciation and extinction events. 
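The reciprocal transformation described above can be written out in a few lines. This is a minimal sketch, assuming events arrive at a constant rate so that the expected waiting time is simply one over the rate; the function name `expected_wait_years` is an illustrative assumption.

```python
YEARS_PER_MYR = 1_000_000


def expected_wait_years(rate_per_myr):
    """Expected waiting time (in years) between events for one lineage.

    Assumes a constant-rate process, for which the mean waiting time
    is the reciprocal of the rate. The rate is in events per Myr.
    """
    if rate_per_myr <= 0:
        raise ValueError("rate must be positive")
    return YEARS_PER_MYR / rate_per_myr


# A transition rate of 42 events per Myr gives a wait of about
# 23,800 years, which the text rounds to 24,000 years (0.024 Myr).
wait_selfing = expected_wait_years(42)
```

Expressing the parameter as a waiting time keeps its units explicit, which is the point the passage makes about rates versus percentages.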
For example, the highest diversification rate across all life (in table 1a of Lagomarsino et al., 2016) is 3.07 diversification events/Myr, which means every species is expected to speciate into two every 325,000 years—a high rate, but not unreasonable for a recent, presumably ongoing, radiation. Of course, we emphasize that transition rates are not the only way to understand biology. Correlations between traits, detailed studies of ontogeny, understanding phylogenetic relationships, and so forth, are obviously important. Our use of a transition rate as an example in a simulation is different from arguing that the goal of comparative biology is to define processes purely by a single "global rate". We regret that our intention was apparently unclear. Central to evolutionary biology is to understand the myriad patterns and processes that produced the extraordinary diversity of life on Earth. Sometimes the focus is on particular, unique events (e.g., the evolution of the carpel or what happened after tarweeds made it to Hawaii), while other questions are on more general scales (e.g., how complex traits evolve, how species diversify in novel habitats). We argue that a continued push is needed to understand these processes on a more general level, while also learning from the exceptions. If one wants to go beyond natural history of a particular group to see how some observable phenomenon works in practice, constructing a model is an important next step—models help explain the world and assess whether predictions from models match reality. Otherwise, we are left with "just-so stories" (Gould, 1978). In our case, we chose to focus on transition rates to illustrate our point. Think of the wide variety of ways transitions in mutation rate, selfing rate, dispersal rate, extinction rate, rate of polyploidization, rate of gain or loss of woodiness or C4 photosynthesis have shaped biological diversity. 
Transition rates also tend to be heritable, but with important considerations for heterogeneity across time and taxa, which has been a focus of our research (e.g., O'Meara et al., 2006; Beaulieu et al., 2012, 2013a; O'Meara, 2012; Beaulieu and O'Meara, 2016; O'Meara, Smith et al., 2016; Caetano et al., 2018). Again, the use of transition rates is just one example of a model parameter, not the ultimate goal of biologists. Donoghue and Edwards (2019) reject the utility of models on large trees for anything but identifying focal clades and in essence promote closely examining the details of specific individual transitions observed for a focal trait. Looking at focal transitions has led to many discoveries, and we agree that it is one important way, among others, to understand biology. However, there are three important caveats readers should note for the "examine multiple transitions" approach advocated by Donoghue and Edwards (2019). First, a state transition at some point can only ever be understood through some form of an ancestral state reconstruction, that is, estimating where exactly a character switched from one state to another. A given transition might be so rare and such an obvious change that an implicit parsimony map will suffice. More rigorous approaches to get the same mapping include maximum parsimony, stochastic character mapping, or marginal or joint reconstruction from maximum likelihood or Bayesian approaches. Nevertheless, some sort of reconstruction must have been done to know where on the tree the transitions occurred, so ancestral state estimation still relies on a model. 
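To make the point concrete that even a parsimony map is the output of an explicit procedure, here is a minimal sketch of the Fitch downpass for a discrete character on a rooted binary tree. It is an illustration only, not code from the article; the nested-tuple tree encoding and the name `fitch_downpass` are assumptions made for the example.

```python
def fitch_downpass(tree, tip_states):
    """Fitch parsimony downpass on a rooted binary tree.

    `tree` is either a tip name (str) or a 2-tuple of subtrees.
    `tip_states` maps each tip name to its observed state.
    Returns (state_set, n_changes): the set of most parsimonious
    states at this node and the minimum number of changes below it.
    """
    if isinstance(tree, str):  # tip: its state is observed directly
        return {tip_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_downpass(left, tip_states)
    right_set, right_cost = fitch_downpass(right, tip_states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # Children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Four tips, one of which (C) carries the derived state 1.
example_tree = (("A", "B"), ("C", "D"))
example_states = {"A": 0, "B": 0, "C": 1, "D": 0}
root_set, changes = fitch_downpass(example_tree, example_states)
```

For this toy tree the root set is `{0}` with one change. When the children disagree everywhere (for example tips 0, 1, 0, 1), the root set is `{0, 1}`, meaning parsimony alone cannot resolve the ancestral state.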
While ancestral state reconstruction methods are fraught with difficulties (e.g., Schluter et al., 1997; Cunningham et al., 1998; Cunningham, 1999; Omland, 1999; Oakley et al., 2000; Salisbury and Kim, 2001; Mossel, 2003; Lucena and Haussler, 2005; Mossel and Steel, 2005; Goldberg and Igic, 2008; Li et al., 2008, 2010; Losos, 2011; Royer-Carenzi et al., 2013; Gascuel and Steel, 2014), they will always remain a tempting enterprise. But, the important point is that, aside from parsimony (though only arguably, given, for example, the model proposed by Tuffley and Steel, 1997), ancestral state reconstruction algorithms use transition rates to produce their estimates, and so state changes cannot be understood without them. The second caveat, which is related to the first, is that by moving the primary focus to the reconstructions, we ask quite a lot of the data. Consider a 100-taxon tree: To estimate transitions, one is technically using a single value from each of the 100 species to infer not just transition rates for the model, but also the likeliest states at 99 nodes. There is also an interaction between the transition rates and ancestral state reconstructions that has been largely ignored by the field. As character change becomes increasingly labile, the underlying Markov process actually makes it increasingly difficult to infer ancestral states accurately (Sober and Steel, 2002). In the absence of fossil data from right around the transition, we may also make the implicit assumption that descendants differing in the focal trait have not undergone state change in any other trait since then. Finally, if we want to see what additional factors affect individual transitions of our focal traits, we can still use models. For example, the method of Pagel (1994) might be a good approach for testing whether high herbivore pressure leads to higher rates of becoming deciduous (but see Maddison and FitzJohn, 2015—there are significant problems with this approach). 
In Beaulieu and O'Meara (2018), we expressed our concerns about ascertainment bias in general. That is, perhaps by focusing on certain clades, those with variation in a trait of interest, we are often misled about general processes because we only look at peculiar subsets of life. The example we used was to focus on variable clades only (but this is not the only source of ascertainment bias, as we expand upon below and in Table 1). Donoghue and Edwards (2019) responded that ascertainment bias is not a problem in practice. First, to suggest that ascertainment bias is not a serious problem is to ignore all the ways in which it is already accounted for in many statistical and analytical tools. For example, the issue of ascertainment bias is a well-known problem in phylogenetic inference. Felsenstein (1992) first noted that when certain restriction sites are absent from all species, the entire site is omitted from the data matrix. The data set, now biased and unrepresentative because it contains only variable sites, can potentially inflate inferred branch lengths and generate erroneous trees. Felsenstein (1992) proposed a simple but clever modification to the likelihood calculation at a site. This modification was later adopted by Lewis (2001) in phylogenetic inference based on morphological matrices, which also often omit invariant characters (e.g., "cell wall present" is never a trait used in plant phylogenetics). Similar biases and associated modifications to the underlying likelihood calculations have been proposed for single nucleotide polymorphisms (Leaché et al., 2015) and restriction-site-associated DNA-restriction data sets (Peterson et al., 2012). Correcting for ascertainment bias is also central to properly estimating diversification rates (i.e., speciation and extinction). With nonzero extinction rates, there is a probability that a clade observed today could have, alternatively, gone completely extinct at some point in the past. 
The effect is that the clades that have survived to the present often get off to a running start initially and are, therefore, part of a distribution of clades whose diversity is not a representative sample of the typical clade sizes for a given set of speciation and extinction parameters (Nee et al., 1994; Magallón and Sanderson, 2001). A simple way of thinking about this type of ascertainment bias is to imagine a busload of patrons arriving at a casino, each clutching a shiny quarter. The patrons still playing several hours later must have had an unusually good run of luck at the beginning of the night, even though the chances of winning were the same for everyone. In the same way, not accounting for survival probability can produce spurious estimates of the diversification process. If it were just diversification rates being uniformly biased up or down, that would not be ideal, but it might be acceptable (see Rabosky et al., 2017 on this point). However, this is not the case, and in fact, we can be misled as to which clades are diversifying faster if we choose not to correct for the ascertainment bias. We can demonstrate the problem through a simple simulation (scripts available from the Dryad Digital Repository: https://doi.org/10.5061/dryad.n9c0c8m). We generated 50 pairs of clades under a birth–death process, each with a different set of speciation and extinction rates (though with extinction never exceeding speciation), that were terminated after 20 Myr, so that we have 50 comparisons between sister groups of the same age but different true diversification rates. A common question is which of the two clades has a higher net diversification rate. Choosing at random, we would be right 50% of the time; ignoring ascertainment bias, we would be right 68% of the time, but incorporating it properly, we would be right 81% of the time. Clearly, it is better to incorporate ascertainment bias. 
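The casino analogy can be illustrated with a toy simulation. The sketch below is not the authors' Dryad script and does not reproduce their sister-clade comparison; it only shows, under assumed rates, that clades surviving to the present are on average larger than the unconditioned expectation. All names and parameter values (`simulate_clade_size`, `survivor_bias`, birth 0.3, death 0.2, 10 time units) are hypothetical.

```python
import random


def simulate_clade_size(birth, death, t_max, rng, cap=10_000):
    """Simulate a constant-rate birth-death process from one lineage.

    Returns the number of lineages alive at time t_max (0 if extinct).
    `cap` is a safety limit on clade size to bound the run time.
    """
    n, t = 1, 0.0
    while 0 < n < cap:
        # Time to the next event is exponential with rate n * (birth + death).
        t += rng.expovariate(n * (birth + death))
        if t >= t_max:
            break
        if rng.random() < birth / (birth + death):
            n += 1  # speciation
        else:
            n -= 1  # extinction
    return n


def survivor_bias(birth=0.3, death=0.2, t_max=10.0, n_reps=2000, seed=1):
    """Compare mean clade size over all replicates with survivors only."""
    rng = random.Random(seed)
    sizes = [simulate_clade_size(birth, death, t_max, rng) for _ in range(n_reps)]
    survivors = [s for s in sizes if s > 0]
    mean_all = sum(sizes) / len(sizes)
    mean_surv = sum(survivors) / len(survivors) if survivors else 0.0
    return mean_all, mean_surv, len(survivors) / n_reps


mean_all, mean_surv, frac_surviving = survivor_bias()
```

Because extinct clades contribute zero to the overall mean, the survivors-only mean is at least as large. Rates estimated from surviving clades without conditioning on survival would therefore tend to be too optimistic.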
Also note that the sister-clade comparison is just one simulation, and understanding ascertainment bias in the context of diversification is a tricky problem that is still unresolved (see Stadler, 2013 for more details). Nevertheless, our demonstration argues against dismissing it as unimportant. In Beaulieu and O'Meara (2018), one intention was to highlight the many additional types of ascertainment bias at work in most comparative analyses, each capable of producing data sets that are not representative of the overall evolutionary process. We focused on ascertainment bias of clades that exhibit variable characters, but it is worth expanding on the various kinds of ascertainment bias affecting evolutionary studies (Table 1). For instance, it is much easier to communicate evolutionary patterns if the focal group is named, like "angiosperms" (Angiospermae; Cantino et al., 2007), instead of, say, the clade that excludes Amborella and Nymphaeales. The name "angiosperms" is immediately recognizable because it denotes a morphologically distinct and seemingly more natural group of plants. The existence of a name has the unintended consequence of driving research questions that are motivated simply by the synapomorphies that define a named group (see Smith et al., 2011 for a more thorough discussion of this topic). There are additional biases that reflect other practicalities of choosing a clade for study. Clades are often chosen because they are tractable, neither too big nor too small, already have a fair bit known about them, and are often geographically convenient. Sometimes clades are chosen because of their apparent charisma (e.g., silverswords) or because they seem atypical with respect to their observed trait variation. Extrapolating from these particular clades to processes operating more generally in less compelling clades may be problematic. 
We also worry that this issue is compounded by the fact that only a portion of possible clades are examined closely, while others are ignored completely. For example, we ran an analysis of the most recent 1000 papers published in the American Journal of Botany to extract any plant genera discussed anywhere in any paper (genus is the finest scale reliably retrieved, as tools involved cannot yet always identify Q. rubra as Quercus rubra; see Appendix S1 for analysis details). There is strong phylogenetic signal in which clades were even mentioned (Pagel's λ [Pagel, 1999] was 0.93), showing that botanical interests are clumped and many key groups are relatively unexamined. Donoghue and Edwards (2019) contend that extrapolations from specific clades to broader ones are not general practice. We disagree. In fact, we argue that it is standard practice for researchers to contextualize their findings by showing whether they reinforce or depart from general trends—that is, ones broadly shared across diverse species or ecosystems. A desire to uncover general patterns, and whether those patterns were generated by shared or divergent mechanisms, is the primary motivator of most scientific research. For example, Edwards et al. (2017) contextualized their detailed study of 20 Asian forest species sampled across Viburnum, combined with a broader Viburnum comparative study of 120 species, against the backdrop of biome assembly, suggesting that the relatively balanced proportion of deciduous and evergreen species they examined "may explain the massive convergence of adaptive strategies that characterizes the world's biomes." Even though Viburnum does not occur in the tundra, deserts, or deep ocean, a detailed study of their evolution may indeed help us understand evolution in these biomes. Such extrapolation is useful, as it generates predictions to test elsewhere and helps formulate broader principles of how traits and biogeography interact on evolutionary time scales. 
On the other hand, Beaulieu et al. (2013b) contrasted their results of a southern hemisphere origin of campanulid angiosperms and potential for Gondwana vicariance with the findings of the many small-scale studies that suggested the break-up of Gondwana was not an important event for angiosperms as a whole (e.g., Sanmartín and Ronquist, 2004). Interestingly, this prior generalization came from an aggregation of smaller studies of groups that were far too young, and not at the right phylogenetic scale, for this type of question; the question actually required a different approach and a much broader scale. Science thrives on bold ideas flung out into the world, so using a charismatic clade to make predictions about larger groups is important. However, Donoghue and Edwards (2019) seem to advocate the immoderate view that studies at larger scales are not very relevant for understanding evolution, explicitly rejecting a consensus view that studies at different scales can be complementary and valuable. In their view, "large" studies can merely identify patterns, while "small" studies can help identify mechanisms. (Note that there is no guidance on the location of the cutoff, though studies of 20 species [Edwards et al., 2017] are considered small.) We strongly object to this view. The motivation for the simulations presented in Beaulieu and O'Meara (2018) was to simply investigate the value, as well as the potential costs, of large-scale studies in dealing with many of the biases in comparative biology. We expand on these biases here (Table 1). For instance, examining larger, comprehensive clades will naturally include relatively obscure and unstudied groups often overlooked in favor of compelling, charismatic clades. 
Large-scale studies also provide the context by which to judge what may be exceptional at smaller scales, and they allow for exploration of patterns and for identifying locations of important changes that may or may not correspond to any formally named group. Stating that studies of different scales can all be important and that appropriate scale can relate to what questions are being addressed would normally be thought of as banalities, except for the persistent view, expressed by Donoghue and Edwards (2019) among others, about the primacy of small scale studies. We need a variety of approaches, at a diversity of scales, to truly understand evolution. There are problems with large-scale studies, but there are problems with small-scale studies, too. The type of analyses at one scale may not be useful at another. For example, at larger scales, one may have to reduce biological diversity to discrete traits with a single value for a species, whereas at a smaller scale one can embrace the variation across individuals that powers so much of evolution, at the cost of power for detecting variation across species. There has been salutary attention paid to the potential pitfalls of working with large phylogenies, but very little attention given to limits of studies of small clades. Of course, it is worth bearing in mind that the "large" vs. "small" distinction is artificial. Once one is applying comparative methods on a set of taxa to understand biology, there is a smooth continuum from a study of 20 species to one of 20,000. As scientists, we all spend our time at the margin of what is known and unknown, trying to expand the illuminated area of knowledge. Some use candles, some spotlights, and some use lasers, which we think makes for a remarkably effective glow overall. 
We thank Michael Donoghue for the great conversations over the years, starting when we were students, which helped form many of the ideas presented here; discussions with Erika Edwards have also proved very insightful. We are grateful to Pam Diggle for giving us the opportunity to respond and to Stacey Smith and an anonymous reviewer for their constructive feedback. We also thank Andrew Alverson, Teo Nakov, and members of the Beaulieu and O'Meara labs for the thoughtful discussions related to these topics. The data and all scripts associated with this article are available from the Dryad Digital Repository: https://doi.org/10.5061/dryad.n9c0c8m (Beaulieu and O'Meara, 2019).

  • Research Article
  • Cite count: 48
  • 10.1080/10635150601088995
Mapping Uncertainty and Phylogenetic Uncertainty in Ancestral Character State Reconstruction: An Example in the Moss Genus Brachytheciastrum
  • Dec 1, 2006
  • Systematic Biology
  • A Vanderpoorten + 1 more

The evolution of species traits along a phylogeny can be examined through an increasing number of possible, but not necessarily complementary, approaches. In this paper, we assess whether deriving ancestral states of discrete morphological characters from a model whose parameters are (i) optimized by ML on a most likely tree; (ii) optimized by ML onto each of a Bayesian sample of trees; and (iii) sampled by a MCMC visiting the space of a Bayesian sample of trees affects the reconstruction of ancestral states in the moss genus Brachytheciastrum. In the first two methods, the choice of a single- or two-rate model and of a genetic distance (wherein branch lengths are used to determine the probabilities of change) or speciational (wherein changes are only driven by speciation events) model based upon a likelihood-ratio test strongly depended on the sampled trees. Despite these differences in model selection, reconstructions of ancestral character states were strongly correlated with each other across nodes, often at r > 0.9, for all the characters. The Bayesian approach of ancestral character state reconstruction offers, however, a series of advantages over the single-tree approach or the ML model optimization on a Bayesian sample of trees because it does not involve restricting model parameters prior to reconstructing ancestral states, but rather allows a range of model parameters and ancestral character states to be sampled according to their posterior probabilities. From the distribution of the latter, conclusions on trait evolution can be made in a more satisfactory way than when a substantial part of the uncertainty of the results is obscured by the focus on a single set of model parameters and associated ancestral states. The reconstructions of ancestral character states in Brachytheciastrum reveal rampant parallel morphological evolution. Most species previously described on phenetic grounds are thus resolved as polyphyletic in origin. 
Species polyphyly has been increasingly reported among mosses, raising severe reservations regarding current species definition.

  • Research Article
  • Cite count: 64
  • 10.1093/sysbio/syr124
Effects of Phylogenetic Signal on Ancestral State Reconstruction
  • Jan 4, 2012
  • Systematic Biology
  • Glenn Litsios + 1 more

One of the standard tools used to understand the processes shaping trait evolution along the branches of a phylogenetic tree is the reconstruction of ancestral states (Pagel 1999). The purpose is to estimate the values of the trait of interest for every internal node of a phylogenetic tree based on the trait values of the extant species, a topology and, depending on the method used, branch lengths and a model of trait evolution (Ronquist 2004). This approach has been used in a variety of contexts such as biogeography (e.g., Nepokroeff et al. 2003, Blackburn 2008), ecological niche evolution (e.g., Smith and Beaulieu 2009, Evans et al. 2009) and metabolic pathway evolution (e.g., Gabaldón 2003, Christin et al. 2008). Investigations of the factors affecting the accuracy with which ancestral character states can be reconstructed have focused in particular on the choice of statistical framework (Ekman et al. 2008) and the selection of the best model of evolution (Cunningham et al. 1998, Mooers et al. 1999). However, other potential biases affecting these methods, such as the effect of tree shape (Mooers 2004), taxon sampling (Salisbury and Kim 2001) as well as reconstructing traits involved in species diversification (Goldberg and Igić 2008), have also received specific attention. Most of these studies conclude that ancestral character states reconstruction is still not perfect, and that further developments are necessary to improve its accuracy (e.g., Christin et al. 2010). Here, we examine how different estimations of branch lengths affect the accuracy of ancestral character state reconstruction. In particular, we tested the effect of using time-calibrated versus molecular branch lengths and provide guidelines to select the most appropriate branch lengths to reconstruct the ancestral state of a trait.

  • Research Article
  • Cite count: 33
  • 10.1186/s12862-015-0471-5
Multiple independent origins of auto-pollination in tropical orchids (Bulbophyllum) in light of the hypothesis of selfing as an evolutionary dead end.
  • Sep 16, 2015
  • BMC Evolutionary Biology
  • Alexander Gamisch + 2 more

Background: The transition from outcrossing to selfing has long been portrayed as an ‘evolutionary dead end’ because, first, reversals are unlikely and, second, selfing lineages suffer from higher rates of extinction owing to a reduced potential for adaptation and the accumulation of deleterious mutations. We tested these two predictions in a clade of Madagascan Bulbophyllum orchids (30 spp.), including eight species where auto-pollinating morphs (i.e., selfers, without a ‘rostellum’) co-exist with their pollinator-dependent conspecifics (i.e., outcrossers, possessing a rostellum). Specifically, we addressed this issue on the basis of a time-calibrated phylogeny by means of ancestral character reconstructions and within the state-dependent evolution framework of BiSSE (Binary State Speciation and Extinction), which allowed jointly estimating rates of transition, speciation, and extinction between outcrossing and selfing.
Results: The eight species capable of selfing occurred in scattered positions across the phylogeny, with two likely originating in the Pliocene (ca. 4.4–3.1 Ma), one in the Early Pleistocene (ca. 2.4 Ma), and five since the mid-Pleistocene (ca. ≤ 1.3 Ma). We infer that this scattered phylogenetic distribution of selfing is best described by models including up to eight independent outcrossing-to-selfing transitions and very low rates of speciation (and either moderate or zero rates of extinction) associated with selfing.
Conclusions: The frequent and irreversible outcrossing-to-selfing transitions in Madagascan Bulbophyllum are clearly congruent with the first prediction of the dead end hypothesis. The inability of our study to conclusively reject or support the likewise predicted higher extinction rate in selfing lineages might be explained by a combination of methodological limitations (low statistical power of our BiSSE approach to reliably estimate extinction in small-sized trees) and evolutionary processes (insufficient time elapsed for selfers to go extinct). We suggest that, in these tropical orchids, a simple genetic basis of selfing (via loss of the ‘rostellum’) is needed to explain the strikingly recurrent transitions to selfing, perhaps reflecting rapid response to parallel and novel selective environments over Late Quaternary (≤ 1.3 Ma) time scales.
Electronic supplementary material: The online version of this article (doi:10.1186/s12862-015-0471-5) contains supplementary material, which is available to authorized users.

  • Research Article
  • Cite Count Icon 989
  • 10.1080/106351599260184
The Maximum Likelihood Approach to Reconstructing Ancestral Character States of Discrete Characters on Phylogenies
  • Jul 1, 1999
  • Systematic Biology
  • Mark Pagel

A phylogeny describes the hierarchical pattern of descent of some group of species from a common ancestor. If information is available on the character states of the contemporary species, the possibility is raised of using that information in combination with the phylogeny to reconstruct the historical events of evolution. These reconstructions can be used to retrieve a picture of the world as the species evolved along what would become the branches of the phylogeny. This, in turn, provides a way to test hypotheses about evolution and adaptation. Methods based on the principle of parsimony reconstruct the ancestral character states to minimize the number of historical character changes required to produce the diversity observed among the contemporary species (see Maddison et al., 1984, for a general account). An alternative to parsimony approaches makes use of the principle of maximum likelihood. Maximum likelihood solutions make the observed data most likely given some model of the process under investigation (see Edwards, 1972). In a phylogenetic context this means reconstructing the ancestral character states to make the character states observed among the contemporary species most probable, given some statistical model of the way evolution proceeds. Maximum likelihood solutions may or may not be the most parsimonious solution. I restrict myself here to using maximum likelihood models to infer ancestral character states for binary discrete characters, that is, for characters that can adopt only two states, although the generalization to more than two states requires no new concepts. My approach to reconstructing ancestral states makes use of a Markov model of binary character evolution on phylogenies (Pagel, 1994). Sanderson (1993) describes a related model for investigating rates of gains and losses of characters for which the ancestral states are assumed to be known. Schluter (1995), Yang et al. (1995), and Koshi and Goldstein (1996) derive methods that are similar to the procedures I will describe here. However, Yang et al. (1995) and Koshi and Goldstein (1996) use what I shall term “global” methods for estimating ancestral characters; I argue for a “local” approach on grounds that the global method does not produce a maximum-likelihood estimate of the hypothesis of interest. Schluter (1995) reported global and local estimators in his investigation of artiodactyl ribonucleases, and Schluter et al. (1997) reported global estimators. In several recent papers, Schluter (1995; Schluter et al., 1997) called attention to the usefulness of reconstructing ancestral character states for testing ideas about adaptation and evolution, and much of what I say here owes its inspiration to these investigations. Mooers and Schluter (1999) now provide important additional examples of how maximum likelihood methods can return both more information about ancestral character states than parsimony approaches, as well as information that is at odds with parsimony reconstructions. I intend this article to act as a primer to those who are interested in using maximum likelihood methods but who may not be familiar with the mathematics of the approach. Accordingly, I begin with the simplest case of estimating the ancestral state of two species.
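
As a rough illustration of the two-species case mentioned at the end of this abstract, the following Python sketch computes the relative support for each state at the common ancestor of two tips under a two-state Markov model. It is a minimal sketch under stated assumptions, not the paper's procedure: the function names, the example rates and the equal prior weights on the ancestral state are all hypothetical, and the transition rates are treated as known rather than estimated by maximum likelihood.

```python
import math


def mk2_transition_probs(q01, q10, t):
    """Transition probabilities P[i][j] over a branch of length t for a
    two-state continuous-time Markov chain with rates q01 (0 -> 1) and
    q10 (1 -> 0). Standard closed-form solution for two states."""
    total = q01 + q10
    decay = math.exp(-total * t)
    p01 = q01 / total * (1.0 - decay)
    p10 = q10 / total * (1.0 - decay)
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


def ancestor_state_support(tip_a, tip_b, t_a, t_b, q01, q10, prior=(0.5, 0.5)):
    """Normalised likelihood of each ancestral state (0 or 1) given the
    observed states of two descendant tips and their branch lengths.
    The rates and the prior weights are assumed inputs (hypothetical)."""
    probs_a = mk2_transition_probs(q01, q10, t_a)
    probs_b = mk2_transition_probs(q01, q10, t_b)
    # Likelihood of the tip data conditional on each possible ancestral state.
    lik = [prior[s] * probs_a[s][tip_a] * probs_b[s][tip_b] for s in (0, 1)]
    norm = sum(lik)
    return [value / norm for value in lik]


if __name__ == "__main__":
    # Hypothetical example: both tips in state 1, symmetric low rates.
    print(ancestor_state_support(1, 1, 1.0, 1.0, q01=0.1, q10=0.1))
```

With symmetric rates and both tips sharing a state, the shared state receives most of the support; asymmetric rates or longer branches shift that support, which is the kind of information a parsimony reconstruction does not return.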

  • Research Article
  • Cite Count Icon 87
  • 10.1002/jez.b.22614
Ancestral state reconstructions require biological evidence to test evolutionary hypotheses: A case study examining the evolution of reproductive mode in squamate reptiles.
  • Mar 2, 2015
  • Journal of Experimental Zoology Part B: Molecular and Developmental Evolution
  • Oliver W Griffith + 5 more

To understand evolutionary transformations it is necessary to identify the character states of extinct ancestors. Ancestral character state reconstruction is inherently difficult because it requires an accurate phylogeny, character state data, and a statistical model of transition rates and is fundamentally constrained by missing data such as extinct taxa. We argue that model based ancestral character state reconstruction should be used to generate hypotheses but should not be considered an analytical endpoint. Using the evolution of viviparity and reversals to oviparity in squamates as a case study, we show how anatomical, physiological, and ecological data can be used to evaluate hypotheses about evolutionary transitions. The evolution of squamate viviparity requires changes to the timing of reproductive events and the successive loss of features responsible for building an eggshell. A reversal to oviparity requires that those lost traits re-evolve. We argue that the re-evolution of oviparity is inherently more difficult than the reverse. We outline how the inviability of intermediate phenotypes might present physiological barriers to reversals from viviparity to oviparity. Finally, we show that ecological data supports an oviparous ancestral state for squamates and multiple transitions to viviparity. In summary, we conclude that the first squamates were oviparous, that frequent transitions to viviparity have occurred, and that reversals to oviparity in viviparous lineages either have not occurred or are exceedingly rare. As this evidence supports conclusions that differ from previous ancestral state reconstructions, our paper highlights the importance of incorporating biological evidence to evaluate model-generated hypotheses.

  • Research Article
  • Cite Count Icon 25
  • 10.1111/jeb.13004
Rate heterogeneity across Squamata, misleading ancestral state reconstruction and the importance of proper null model specification.
  • Nov 17, 2016
  • Journal of Evolutionary Biology
  • S Harrington + 1 more

The binary-state speciation and extinction (BiSSE) model has been used in many instances to identify state-dependent diversification and reconstruct ancestral states. However, recent studies have shown that the standard procedure of comparing the fit of the BiSSE model to constant-rate birth-death models often inappropriately favours the BiSSE model when diversification rates vary in a state-independent fashion. The newly developed HiSSE model enables researchers to identify state-dependent diversification rates while accounting for state-independent diversification at the same time. The HiSSE model also allows researchers to test state-dependent models against appropriate state-independent null models that have the same number of parameters as the state-dependent models being tested. We reanalyse two data sets that originally used BiSSE to reconstruct ancestral states within squamate reptiles and reached surprising conclusions regarding the evolution of toepads within Gekkota and viviparity across Squamata. We used this new method to demonstrate that there are many shifts in diversification rates across squamates. We then fit various HiSSE submodels and null models to the state and phylogenetic data and reconstructed states under these models. We found that there is no single, consistent signal for state-dependent diversification associated with toepads in gekkotans or viviparity across all squamates. Our reconstructions show limited support for the recently proposed hypotheses that toepads evolved multiple times independently in Gekkota and that transitions from viviparity to oviparity are common in Squamata. Our results highlight the importance of considering an adequate pool of models and null models when estimating diversification rate parameters and reconstructing ancestral states.

  • Research Article
  • Cite Count Icon 34
  • 10.1093/sysbio/syq055
Rate Heterogeneity, Ancestral Character State Reconstruction, and the Evolution of Limb Morphology in Lerista (Scincidae, Squamata)
  • Oct 1, 2010
  • Systematic Biology
  • Adam Skinner

Rates of phenotypic evolution derive from numerous interrelated processes acting at varying spatial and temporal scales and frequently differ substantially among lineages. Although current models employed in reconstructing ancestral character states permit independent rates for distinct types of transition (forward and reverse transitions and transitions between different states), these rates are typically assumed to be identical for all branches in a phylogeny. In this paper, I present a general model of character evolution enabling rate heterogeneity among branches. This model is employed in assessing the extent to which the assumption of uniform transition rates affects reconstructions of ancestral limb morphology in the scincid lizard clade Lerista and, accordingly, the potential for rate variability to mislead inferences of evolutionary patterns. Permitting rate variation among branches significantly improves model fit for both the manus and the pes. A constrained model in which the rate of digit acquisition is assumed to be effectively zero is strongly supported in each case; when compared with a model assuming unconstrained transition rates, this model provides a substantially better fit for the manus and a nearly identical fit for the pes. Ancestral states reconstructed assuming the constrained model imply patterns of limb evolution differing significantly from those implied by reconstructions for uniform-rate models, particularly for the pes; whereas ancestral states for the uniform-rate models consistently entail the reacquisition of pedal digits, those for the model incorporating among-lineage rate heterogeneity imply repeated, unreversed digit loss. These results indicate that the assumption of identical transition rates for all branches in a phylogeny may be inappropriate in modeling the evolution of phenotypic traits and emphasize the need for careful evaluation of phylogenetic tests of Dollo's law.

  • Research Article
  • Cite Count Icon 62
  • 10.1111/j.1558-5646.2011.01378.x
LOSS OF SEXUAL RECOMBINATION AND SEGREGATION IS ASSOCIATED WITH INCREASED DIVERSIFICATION IN EVENING PRIMROSES
  • Jul 12, 2011
  • Evolution
  • Marc T J Johnson + 4 more

The loss of sexual recombination and segregation in asexual organisms has been portrayed as an irreversible process that commits asexually reproducing lineages to reduced diversification. We test this hypothesis by estimating rates of speciation, extinction, and transition between sexuality and functional asexuality in the evening primroses. Specifically, we estimate these rates using the recently developed BiSSE (Binary State Speciation and Extinction) phylogenetic comparative method, which employs maximum likelihood and Bayesian techniques. We infer that net diversification rates (speciation minus extinction) in functionally asexual evening primrose lineages are roughly eight times faster than diversification rates in sexual lineages, largely due to higher speciation rates in asexual lineages. We further reject the hypothesis that a loss of recombination and segregation is irreversible because the transition rate from functional asexuality to sexuality is significantly greater than zero and in fact exceeded the reverse rate. These results provide the first empirical evidence in support of the alternative theoretical prediction that asexual populations should instead diversify more rapidly than sexual populations because they are free from the homogenizing effects of sexual recombination and segregation. Although asexual reproduction may often constrain adaptive evolution, our results show that the loss of recombination and segregation need not be an evolutionary dead end in terms of diversification of lineages.

  • Research Article
  • Cite Count Icon 34
  • 10.1016/j.ympev.2006.10.018
Systematics and morphological evolution within the moss family Bryaceae: A comparison between parsimony and Bayesian methods for reconstruction of ancestral character states
  • Oct 27, 2006
  • Molecular Phylogenetics and Evolution
  • Niklas Pedersen + 2 more

  • Research Article
  • Cite Count Icon 63
  • 10.3732/ajb.1300423
Repeated evolution of tricellular (and bicellular) pollen
  • Apr 1, 2014
  • American Journal of Botany
  • Joseph H Williams + 2 more

Male gametophytes of seed plants are sexually immature at the time they are dispersed as pollen, but approximately 30% of flowering plants have tricellular pollen containing fully formed sperm at anthesis. The classic study of Brewbaker (1967: American Journal of Botany 54: 1069-1083) provided a powerful confirmation of the long-standing hypothesis that tricellular pollen had many parallel and irreversible origins within angiosperms. We readdressed the main questions of that study with modern comparative phylogenetic methods. We used our own and more recent reports to greatly expand the Brewbaker data set. We modeled trait evolution for 2511 species on a time-calibrated angiosperm phylogeny using (1) Binary State Speciation and Extinction (BiSSE), which accounts for the effect of species diversification rates on character transition rates and, (2) the hidden rates model (HRM), which incorporates variation in transition rates across a phylogeny. Seventy percent of species had bicellular pollen. BiSSE found a 1.9-fold higher bicellular to tricellular transition rate than in the reverse direction, and bicellular lineages had a 1.8-fold higher diversification rate than tricellular lineages. HRM found heterogeneity in evolutionary rates, with bidirectional transition rates in three of four rate classes. The tricellular condition is not irreversible. Pollen cell numbers are maintained at intermediate frequencies because lower net diversification rates of tricellular lineages are counterbalanced by slower state shifts to the bicellular condition. That tricellular lineages diversify slowly and give rise to bicellular lineages slowly reflects a linkage between the evolution of sporophyte lifestyles and the developmental lability of male gametophytes.

  • Research Article
  • Cite Count Icon 534
  • 10.1093/sysbio/syw022
Detecting Hidden Diversification Shifts in Models of Trait-Dependent Speciation and Extinction.
  • Mar 25, 2016
  • Systematic Biology
  • Jeremy M Beaulieu + 1 more

The distribution of diversity can vary considerably from clade to clade. Attempts to understand these patterns often employ state-dependent speciation and extinction models to determine whether the evolution of a particular novel trait has increased speciation rates and/or decreased extinction rates. It is still unclear, however, whether these models are uncovering important drivers of diversification, or whether they are simply pointing to more complex patterns involving many unmeasured and co-distributed factors. Here we describe an extension to the popular state-dependent speciation and extinction models that specifically accounts for the presence of unmeasured factors that could impact diversification rates estimated for the states of any observed trait, addressing at least one major criticism of BiSSE (Binary State Speciation and Extinction) methods. Specifically, our model, which we refer to as HiSSE (Hidden State Speciation and Extinction), assumes that related to each observed state in the model are "hidden" states that exhibit potentially distinct diversification dynamics and transition rates than the observed states in isolation. We also demonstrate how our model can be used as character-independent diversification models that allow for a complex diversification process that is independent of the evolution of a character. Under rigorous simulation tests and when applied to empirical data, we find that HiSSE performs reasonably well, and can at least detect net diversification rate differences between observed and hidden states and detect when diversification rate differences do not correlate with the observed states. We discuss the remaining issues with state-dependent speciation and extinction models in general, and the important ways in which HiSSE provides a more nuanced understanding of trait-dependent diversification.

  • Research Article
  • Cite Count Icon 185
  • 10.1080/106351599260238
Some Limitations of Ancestral Character-State Reconstruction When Testing Evolutionary Hypotheses
  • Jul 1, 1999
  • Systematic Biology
  • Clifford W Cunningham

The recent explosion of phylogenetic information has had an impact far beyond the field of systematic biology. Workers in a variety of disciplines are now interested in generating and using phylogenies for their groups (or genes) of interest (e.g., Brooks and McLennan, 1991; Harvey et al., 1995; Martins, 1996). In large part, this new interest has been driven by the promise of using phylogenies to reconstruct ancestral character states, usually by parsimony. The simplicity of testing evolutionary hypotheses by mapping characters onto a phylogeny has an appeal that has not been lost on the biological community. Despite its appeal, ancestral state reconstruction is an ambitious exercise. Although we are very familiar with the difficulty of accurately inferring phylogenies with thousands of characters (reviewed by Swofford et al., 1996), the challenges associated with reconstructing ancestral character states of individual characters are correspondingly more difficult (Swofford and Maddison, 1987, 1992; Maddison and Maddison, 1992; Collins et al., 1994a; Frumhoff and Reeve, 1994; Pagel, 1994; Maddison, 1995; Schluter, 1995; Schultz et al., 1996; Omland, 1997; Schluter et al., 1997; Cunningham et al., 1998). Whereas we generally assume that data used for phylogenetic reconstruction are selectively neutral, many—if not most—of the characters we reconstruct to test evolutionary hypotheses are thought to be under selection (Brooks and McLennan, 1991; Harvey and Pagel, 1991). In fact, their presumed selective importance is often why we are interested in them in the first place. Therefore, the problems of convergence and parallel evolution that plague phylogenetic inference should be that much more serious when we test evolutionary hypotheses with ancestral state reconstructions. In this paper, I discuss the limitations of using reconstructed ancestral character states to test evolutionary hypotheses. In particular, I argue that hypotheses of irreversible evolution (e.g., Dollo’s law; Dollo, 1893) are particularly difficult to test by using ancestral character state reconstruction. I illustrate this point with two case studies of life history evolution in echinoderms.
