Abstract

Systematists and geneticists often wish to compare two pair-wise distance or similarity matrices based on different characters. Conventional correlation tests are not appropriate in such situations because of dependencies among the entries in each matrix. A broad class of permutation tests for association between matrices is described. Specific members of the class are discussed and compared using real and simulated data. Statistics resembling Kendall's tau and Spearman's rho are found to have desirable power and invariance properties. [Distance matrices; permutation tests; Mantel test; Spearman's rho; Kendall's tau.]

Biologists frequently summarize multivariate data from groups or individuals by computing some measure of generalized distance or similarity between each pair of groups or individuals. Distances based on genetic, geographic, morphological, and linguistic traits appear in the literature (e.g., Howells, 1966). The problem then becomes the comparison of two pair-wise distance or similarity matrices based on different characters for the same groups or individuals.

Problems requiring the comparison of two pair-wise distance matrices also arise in taxonomic studies. Here each matrix may contain cophenetic values (Sokal and Rohlf, 1962) computed from a dendrogram. Two dendrograms to be compared may be based on different suites of characters or produced by different researchers from different sets of data.

Some authors (Howells, 1966; McKechnie et al., 1975; Guries and Ledig, 1982) simply calculate the sample correlation coefficient (r) between the pair-wise distances and test it for significance. The normality assumption required to test the significance of r is suspect for many distance measures (e.g., see Friedlaender et al., 1971). This difficulty could be overcome by using a distribution-free test for independence. More problematic, however, are the dependencies among the pair-wise distances in each matrix, which violate the assumptions of both normal-theory and distribution-free tests.
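The dependency problem can be made concrete with a minimal sketch. Each of the n points contributes to n - 1 distances, so a permutation test of the Mantel (1967) type relabels the points of one matrix, permuting its rows and columns together, instead of shuffling individual distances. The function names, the default of 999 random permutations, the one-sided p-value, and the `use_ranks` option (a Spearman-like statistic computed on ranked distances) are illustrative assumptions, not the specific statistics or procedures defined in this paper.

```python
import random


def upper_triangle(matrix, order):
    """Above-diagonal entries after relabelling the points by `order`.

    Rows and columns are permuted together, so every point keeps its
    full set of distances; only the point labels move.
    """
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def midranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    idx = sorted(range(len(values)), key=lambda k: values[k])
    out = [0.0] * len(values)
    start = 0
    while start < len(idx):
        end = start
        while end + 1 < len(idx) and values[idx[end + 1]] == values[idx[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of ranks start+1 .. end+1
        for k in range(start, end + 1):
            out[idx[k]] = shared
        start = end + 1
    return out


def correlation(x, y):
    """Product-moment correlation (undefined if x or y is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / (sxx * syy) ** 0.5


def matrix_permutation_test(a, b, n_perm=999, use_ranks=False, seed=None):
    """Return (observed statistic, one-sided permutation p-value).

    `a` and `b` are square symmetric matrices (lists of lists) over the
    same n points. Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    labels = list(range(len(a)))
    prep = midranks if use_ranks else list
    x = prep(upper_triangle(a, labels))
    observed = correlation(x, prep(upper_triangle(b, labels)))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # random relabelling of the points in b
        if correlation(x, prep(upper_triangle(b, labels))) >= observed:
            hits += 1
    # The observed arrangement is counted as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `matrix_permutation_test(a, b)` gives a statistic resembling r, and `matrix_permutation_test(a, b, use_ranks=True)` gives its rank-based counterpart. For small n, all n! relabellings could be enumerated instead of sampled.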
Valid permutation tests for association between matrices have been proposed by authors in other fields (Mantel, 1967; Hubert, 1978a), but have been little used by systematists or geneticists (but see Douglas and Endler, 1982). In this paper, I describe the general strategy for constructing such tests and give examples of specific test statistics, illustrating test properties with real and simulated data. Although I use the term distance matrix throughout the paper, the matrices to be compared may be similarity or dissimilarity matrices.

The two matrices must be obtained from different sets of data for the permutation strategy described here to be appropriate, a point discussed by Hubert and Baker (1977), Hubert (1978a), and Sokal (1979). In particular, these permutation tests cannot be used to compare a matrix representing a dendrogram with the original similarity matrix from which the dendrogram was constructed. Neither should the tests be used to compare two dendrograms constructed by different methods from the same data. In these two situations, the hypothesis of independence between the two matrices is clearly untenable. Note that the test statistics described in this paper can still be used as descriptive measures of association in these situations; it is the hypothesis-testing procedure that is inappropriate.

TESTS FOR ASSOCIATION

Permutation Strategy

Let n be the number of points (groups or individuals) between which distances have
