Tests for null hypotheses of 'absence of structure' should play an important role in any exploratory study, to guard against interpretation of sample patterns that could have been obtained by chance, and two new tests of this type are described. In the multivariate analyses that arise in community ecology and many other environmental contexts, e.g. in linking assemblage patterns to forcing environmental variables (gradient analysis), the problem of chance associations is exacerbated by the large number of combinations of abiotic variables that can usually be examined. A test which allows for this selection bias is described (the global BEST test), which applies to any dissimilarity measure, utilises only rank dissimilarities, and operates by permutation, assuming no specific distributional form or parametric expression for the biotic to abiotic links. A second permutation procedure, the similarity profile routine (SIMPROF), tests for the presence of sample groups (or more continuous sample patterns) in a priori unstructured sets of samples, for which an a priori structured test (e.g. the widely-used ANOSIM) is invalid. One context is in interpreting dendrograms from hierarchical cluster analyses: a series of SIMPROF tests provides objective stopping rules for ever-finer dissection into subgroups. Connecting these two tests is a third methodological strand, adapting De'ath's multivariate equivalent of univariate CART analysis (Classification And Regression Trees) to a non-parametric context. This produces a divisive, constrained, hierarchical cluster analysis of samples, based on their assemblage data, termed a linkage tree. The constraint is that each binary division of the tree corresponds to a threshold on one of the environmental variables and, consistently with related non-parametric routines, maximises the high-dimensional separation of the two groups, as measured by the ANOSIM R statistic. 
Such linkage trees therefore provide abiotic 'explanations' for each biotic subdivision of the samples but, as with unconstrained clustering, the LINKTREE routine requires objective stopping rules to avoid over-interpretation, these again being provided by a sequence of SIMPROF tests. The inter-connectedness of these three new developments is illustrated by data from the literature of marine ecology.
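As an illustration of the constraint described above, a single binary division of a linkage tree might be sketched as follows. This is a minimal sketch, not the LINKTREE implementation. It assumes a precomputed symmetric dissimilarity matrix and uses the standard two-group form of the ANOSIM statistic, R = (mean between-group rank - mean within-group rank) / (M/2), with M the number of sample pairs. The function names (`average_ranks`, `anosim_r`, `best_threshold_split`), the use of midpoints as candidate thresholds, and the toy data are all assumptions made for illustration. The SIMPROF stopping test is not included.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(dissim, labels):
    """ANOSIM R for a grouping of samples, using rank dissimilarities only."""
    pairs = list(combinations(range(len(labels)), 2))
    ranks = average_ranks([dissim[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if labels[i] == labels[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if labels[i] != labels[j]]
    if not within or not between:
        raise ValueError("need at least one within- and one between-group pair")
    mean_between = sum(between) / len(between)
    mean_within = sum(within) / len(within)
    return (mean_between - mean_within) / (len(pairs) / 2)


def best_threshold_split(dissim, env):
    """Find the single-variable threshold whose two groups maximise R.

    env maps a variable name to its list of per-sample values.
    Candidate thresholds are midpoints between consecutive distinct values
    (an assumption of this sketch).
    """
    best = None
    for name, values in env.items():
        levels = sorted(set(values))
        for lo, hi in zip(levels, levels[1:]):
            threshold = (lo + hi) / 2
            labels = [v > threshold for v in values]
            r = anosim_r(dissim, labels)
            if best is None or r > best[0]:
                best = (r, name, threshold)
    return best


# Hypothetical toy data: four samples, two abiotic variables.
toy_dissim = [
    [0.00, 0.10, 0.80, 0.90],
    [0.10, 0.00, 0.70, 0.85],
    [0.80, 0.70, 0.00, 0.20],
    [0.90, 0.85, 0.20, 0.00],
]
toy_env = {"depth": [5, 7, 20, 25], "salinity": [30, 35, 31, 34]}

toy_result = best_threshold_split(toy_dissim, toy_env)
```

In this toy case the depth variable separates samples 1 and 2 from samples 3 and 4, which matches the biotic pattern exactly, so it is chosen over any salinity threshold. A full tree would apply the same search recursively to each resulting group, stopping when a SIMPROF test finds no remaining structure to explain.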