Abstract

Species richness (SR) and phylogenetic diversity (PD) are highly correlated measures of plant diversity. Each, by itself, is significantly associated with plant community biomass in biodiversity experiments. As presented by Cadotte (2015) and as we present below, reasonable but alternative analyses that attempt to control for this correlation in different ways provide contradictory or inconclusive support for the hypothesis that PD is superior to SR as a predictor of community biomass.
In Venail et al. (2015), we re-analysed data from 16 experimental manipulations of grassland SR to look at how SR and PD influence variation in plant community biomass through time. Using four types of analyses, we showed that, after statistically controlling for variation in SR, PD was not related to community biomass or to the temporal stability of biomass. We did, however, find that SR tends to increase the biomass production of plant communities after controlling for PD.
In his comment, Cadotte expressed two concerns about our analyses. One is that we used non-random subsets of experiments, rather than the full data set, for some of our analyses (types 2, 3). We were clear in stating these analyses were based on non-random subsets that were specifically chosen to minimize the SR–PD correlation and avoid problems associated with multicollinearity. We acknowledge that our tests are conservative, a cost of which is that they sacrifice statistical power while, at the same time, minimizing the chance of drawing an incorrect conclusion. But we disagree with Cadotte's suggestion that our use of non-random data subsets led to ‘biased’ conclusions, and demonstrate later in this response that his claim of bias is unsubstantiated. Cadotte's second concern was that our analyses did not account for differences in biomass across studies. This is an important criticism to consider; we made a mistake by not controlling for variation in biomass.
To address this issue, Cadotte used mixed models where study was included as a random effect, and ran analyses that standardized biomass among sites. Collectively, these led Cadotte to conclude ‘All analyses strongly support previous literature claims about the value of PD and I further show that: (i) PD provides a more powerful explanation of variation in biomass production than species richness; (ii) PD explains variation in biomass production after controlling for richness; and (iii) the use of data subsets inadvertently biased the conclusions’.
We have two concerns with Cadotte's re-analysis. First, Cadotte's approach largely ignores the concerns we raised about multicollinearity. When two or more predictors exhibit a high degree of correlation, each predictor contains little unique information. As a result, it is difficult (if not impossible) to estimate their independent effects using statistical methods like multivariate or partial regression (Dormann et al. 2013). The consequences of multicollinearity include inflated error estimates that can alter conclusions about which predictors are significant, as well as unstable parameter estimates that can change in sign and magnitude with minor alterations to analyses (Graham 2003; Zuur, Ieno & Elphick 2010). Multicollinearity is a concern for the data set of Venail et al. (2015) because PD and SR are correlated with r = 0·90. We were concerned about drawing inferences from predictors that have little unique information, which is why we performed analyses that all attempted to hold one of the two predictors constant while examining the impact of the other. In contrast, Cadotte performed model selection using the full data set where the SR–PD correlation was r = 0·90. We remain sceptical of this approach because of the difficulties of generating reliable estimates for strongly correlated predictors.
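To give a sense of scale for the multicollinearity concern, the following minimal sketch computes the variance inflation factor (VIF) for a model with two predictors, where VIF = 1 / (1 − r²). The VIF is a standard collinearity diagnostic, but it is not a statistic reported in this exchange, and the function name is our own. The two correlations are the ones quoted in this response: r = 0·90 for the untransformed predictors and r = 0·70 for the ln-transformed predictors.

```python
import math


def variance_inflation_factor(r: float) -> float:
    """VIF for either predictor in a two-predictor linear model.

    With only two predictors, the R^2 from regressing one predictor on the
    other equals r^2, so VIF = 1 / (1 - r^2).
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 1.0 / (1.0 - r * r)


# Correlations quoted in the text; the labels are illustrative.
for label, r in [("SR vs PD", 0.90), ("ln(SR) vs ln(PD)", 0.70)]:
    vif = variance_inflation_factor(r)
    # sqrt(VIF) is the factor by which a coefficient's standard error grows
    # relative to the case of uncorrelated predictors.
    print(f"{label}: r = {r:.2f}, VIF = {vif:.2f}, SE inflation = {math.sqrt(vif):.2f}x")
```

Under these assumptions the VIF is about 5·3 at r = 0·90 and about 2·0 at r = 0·70, so the sampling variance of each coefficient is inflated roughly fivefold in the untransformed case and roughly twofold after ln-transformation.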
A second issue with Cadotte's analyses, of which we are also guilty in some analyses in our study, is the assumption that the relationship between biodiversity (PD or SR) and community biomass is linear. Most studies included in the Venail et al. data set have shown that the effect of biodiversity on community biomass is positive, but nonlinear and decelerating. For example, Cardinale et al. (2011) summarized the form of diversity–biomass relationships for 433 experimental manipulations of primary producer richness and concluded ‘Of the studies that have shown a positive effect of producer diversity on producer biomass, 79% were best fit by some form of a positive but decelerating curve (log, power, or M-M functions, Fig. 5A)’. In contrast, only 13% of studies to date are best fit by linear relationships.
We reran Cadotte's analyses after accounting for nonlinear relationships and found that most of his conclusions did not hold. Our modified analyses (provided in accompanying R-code) repeat Cadotte's analyses, which account for variation in biomass among studies, but use ln-transformed predictors to also account for positive, decelerating relationships.
Cadotte's first set of analyses modelled biomass in experimental plots as linear functions of SR and/or PD with study included as a random effect to account for differences in biomass among sites. These produced AIC values of 10 216 and 10 194 for SR and PD, respectively, and an AIC of 10 196 for a model including both SR and PD as predictors. In contrast, the best model in our modified analyses included both ln-transformed SR and PD with an AIC of 10 184. This represents an improved fit to data compared to Cadotte's analyses, and confirms that failure to account for nonlinear relationships led to inferior models.
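The AIC comparison above can be made explicit with a short sketch that ranks the four models by ΔAIC relative to the best-fitting one. The AIC values are those quoted in the text; the dictionary labels and function names are our own, and the Akaike weights are an illustrative addition that is not reported in this exchange. The sketch assumes all four models were fitted to the same response data, which is what makes their AIC values comparable.

```python
import math

# AIC values quoted in the text; the keys are illustrative labels.
aic = {
    "SR (linear)": 10216,
    "PD (linear)": 10194,
    "SR + PD (linear)": 10196,
    "ln(SR) + ln(PD)": 10184,
}


def delta_aic(aic_values):
    """Difference between each model's AIC and the lowest AIC in the set."""
    best = min(aic_values.values())
    return {name: value - best for name, value in aic_values.items()}


def akaike_weights(aic_values):
    """Relative support for each model: exp(-delta / 2), normalised to sum to 1."""
    deltas = delta_aic(aic_values)
    relative = {name: math.exp(-0.5 * d) for name, d in deltas.items()}
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}


deltas = delta_aic(aic)
weights = akaike_weights(aic)
# Print models from best (smallest delta) to worst.
for name in sorted(aic, key=aic.get):
    print(f"{name:18s} AIC = {aic[name]}  dAIC = {deltas[name]:2d}  weight = {weights[name]:.4f}")
```

On these numbers the ln-transformed model is 10 AIC units better than the best linear model (PD alone) and 12 units better than the linear model containing both predictors.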
After confirming that relationships between PD, SR and community biomass are better described by nonlinear models, we reran Cadotte's partial regression analyses. His analyses found that PD explains a significant fraction of the residual variation in biomass after accounting for effects of SR (F = 4·09, P = 0·04), but that SR did not explain residual variation after accounting for effects of PD (F = 0·09, P = 0·77). Using ln-transformed predictors where the SR–PD correlation was lower (r = 0·70), we found that ln(PD) explained 0·5% of the variation unaccounted for by ln(SR) (F = 3·79, P = 0·052, R² = 0·005). Yet, ln(SR) explained 1·4% of the residual variation in community biomass unaccounted for by ln(PD) (F = 12, P < 0·01, R² = 0·014).
Cadotte also reran our structural equation model (SEM), but used the full data set where the PD–SR correlation was r = 0·90. He accounted for variation among studies by scaling biomass to have a mean = 0 and SD = 1. Cadotte's SEM (reproduced in Fig. 1a) shows that PD explains a significant fraction of variation in scaled biomass and in the SD of biomass through time. In contrast, SR did not explain variation in either. We reran the same SEM on the full data set, but using ln-transformed predictors to account for nonlinear relationships. The modified SEM was a significantly improved fit over the linear version (compare χ², P-values and AIC for Fig. 1a,b) and led to conclusions that were consistent with those from our original paper (Venail et al. 2015) where we found SR impacts community biomass, but PD does not. In contrast, PD affects the SD of biomass through time, but SR does not.
In his final analysis, Cadotte tried to assess whether the five experiments included in our SEM were a ‘biased’ representation of the full set of 16 experiments. He chose 1000 random subsets of five experiments and, for each subset, ran two mixed effects models – one modelling biomass as a function of PD and one modelling biomass as a function of SR.
He then calculated the difference in AIC for the two models. A ΔAIC <0 indicated that PD was the better predictor of biomass for that random subset, and a ΔAIC >0 indicated that SR was. The frequency distribution of ΔAIC values (Fig. 3 of his comment) is reproduced in Fig. 1c. The mean of this distribution was significantly <0, suggesting PD is a better predictor of biomass than SR in most random subsets of five experiments. In addition, the subset of five experiments used for our SEM differed from the overall distribution, suggesting biased selection. But Cadotte's conclusions about the ‘representativeness’ of the five experiments are overturned when we repeat the same analyses using ln-transformed predictor variables. Indeed, the balance of evidence favoured ln(SR) as the better model (Fig. 1d) with the distribution of ΔAIC values being significantly >0 (mean = +5·64, t = 12·06, P < 0·01). The value of ΔAIC for the subset of five experiments used in our SEM is near the centre of the distribution, indicating it was not a biased subset.
So where do we stand in this exchange? Cadotte, Cardinale & Oakley (2008) found that PD was not only a significant predictor of community biomass in grassland biodiversity experiments, but also explained ~2% more variation than SR. We (Venail et al. 2015) suggested that this synthesis did not control for multicollinearity among predictors. When we controlled for multicollinearity (but failed to account for biomass differences among studies), we found PD was not a significant predictor of community biomass or stability, whereas SR was. Cadotte argued in his comment that our new analyses were incorrect because we did not account for variation in biomass among studies, and were biased by our use of data subsets to control for multicollinearity. Cadotte's re-analyses led him to conclude that PD is not only significant, but is again a better predictor of community biomass than SR.
We responded by pointing out that multicollinearity continues to be a concern about Cadotte's analyses, and his conclusions do not hold after accounting for nonlinear relationships between biodiversity and ecosystem functioning. Whether using the statistical approaches from our original paper (Venail et al. 2015) or model selection favoured by Cadotte, we are led to two conclusions: (i) either SR or PD can explain most of the variation in community biomass and stability on their own because they share so much information. However, (ii) when we examine their effects after statistically controlling for the other, there is little evidence that PD is a better predictor of ecological function than SR. SR is usually a significant predictor of community biomass and stability after controlling for variation in PD, whereas PD is often (though not always) non-significant after controlling for variation in SR.
We would caution against interpreting these results as evidence that PD does not matter for ecosystem functioning. Cadotte is correct that experiments analysed to date have not been explicitly designed to test hypotheses about PD, and therefore, we will need studies that orthogonally manipulate PD and SR to fully resolve their relative importance. On the other hand, given the existing data and analyses, we think it is important that researchers refrain from claiming that phylogenetic diversity is a ‘strong’ predictor of ecosystem functioning, or a ‘better’ predictor than plant richness in grasslands. Such claims are not supported at this time.
