Abstract

Binary similarity coefficients are used widely in paleontology and biostratigraphy for analysis of multivariate data, especially in conjunction with cluster analysis. Cluster analysis of large geological data sets is employed to convey information about the relative distances between groupings within the data. The statistical significance of grouped associations can be determined if a coefficient yields a distribution of values that approximates a binomial distribution. Yet some coefficients currently in use do not yield approximations to binomial distributions (e.g., the Jaccard coefficient). This departure from a binomial distribution is most apparent when coefficient values are based upon a small number of variables per sample, a condition that is an unfortunate but common attribute of geological data. Nevertheless, analyses of empirically derived distributions allow comparison of values computed by different coefficients. In decreasing order of utility and faithfulness of data representation for ≥50% 1's, we rank the coefficients tested as Simple Matching, Hamann, Baroni-Urbani and Buser, Dice, Braun-Blanquet, and Simpson. The Jaccard coefficient should be used with caution because of its peculiarities and its nonbinomial distribution irrespective of the number of variables used.
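For readers unfamiliar with the coefficients named above, the following is a minimal Python sketch, not the authors' code. The abstract does not state any formulas, so the definitions used here are the commonly cited textbook forms in terms of the 2x2 counts a (1 in both samples), b (1 in the first only), c (1 in the second only), and d (0 in both); treat them as assumptions. The function names `counts`, `coefficients`, and `simulate` are hypothetical, and `simulate` is only one plausible way to obtain an empirically derived distribution: it draws pairs of random binary samples with a chosen number of variables and a chosen proportion of 1's.

```python
import math
import random


def counts(x, y):
    """Return the 2x2 counts (a, b, c, d) for two equal-length binary samples."""
    a = sum(1 for i, j in zip(x, y) if i and j)          # 1 in both
    b = sum(1 for i, j in zip(x, y) if i and not j)      # 1 in x only
    c = sum(1 for i, j in zip(x, y) if not i and j)      # 1 in y only
    d = len(x) - a - b - c                               # 0 in both
    return a, b, c, d


def _ratio(num, den):
    """Divide, returning NaN when the coefficient is undefined (zero denominator)."""
    return num / den if den else float("nan")


def coefficients(x, y):
    """Compute the coefficients named in the abstract (assumed textbook forms)."""
    a, b, c, d = counts(x, y)
    n = a + b + c + d
    root = math.sqrt(a * d)
    return {
        "simple_matching": _ratio(a + d, n),
        "hamann": _ratio((a + d) - (b + c), n),
        "baroni_urbani_buser": _ratio(root + a, root + a + b + c),
        "dice": _ratio(2 * a, 2 * a + b + c),
        "braun_blanquet": _ratio(a, max(a + b, a + c)),
        "simpson": _ratio(a, min(a + b, a + c)),
        "jaccard": _ratio(a, a + b + c),
    }


def simulate(name, n_vars, p_one=0.5, trials=1000, seed=0):
    """Empirical distribution of one coefficient for random sample pairs.

    n_vars is the number of variables per sample and p_one the probability
    that any variable is 1. Undefined (NaN) values are dropped.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        x = [rng.random() < p_one for _ in range(n_vars)]
        y = [rng.random() < p_one for _ in range(n_vars)]
        value = coefficients(x, y)[name]
        if not math.isnan(value):
            values.append(value)
    return values
```

Calling, for example, `simulate("jaccard", n_vars=10)` and `simulate("simple_matching", n_vars=10)` and comparing histograms of the two lists is one way to explore how the shape of each distribution changes as the number of variables per sample is reduced.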
