Abstract

SUMMARY

This paper is concerned with estimating the number of species present in a population from the number of species obtained in a sample. A Bayesian approach is used, and results are based on a zero-truncated negative binomial prior distribution for the number of species in the population. Results for the posterior model are compared with the prior information, and guidelines are suggested for choosing values of the parameters.

The estimation of the number of species in a population, or of the number of species unrepresented in a sample, is an inferential problem that occurs frequently in the biological sciences. Much previous work has stemmed from the parametric model of Fisher, Corbet and Williams (1943), in which Fisher fitted a theoretical distribution to Williams' data on macrolepidoptera. Good (1953) and, later, Good and Toulmin (1956) derived a model based on a set of statistical hypotheses concerning the population frequencies of the individual species. However, estimating the total number of species in the population was not the prime concern of any of this work. When such estimation has been the main aim, it has usually been assumed that the species have equal relative abundances, as in, for example, the maximum likelihood approach of Driml and Ullrich (1967). Clearly, this assumption will not be appropriate in many practical situations.

More recently, nonequiprobable models have been derived by Efron and Thisted (1976) and Hill (1979). Efron and Thisted estimated the number of unseen species using an Euler transformation and a linear programming approach; in their application, the unseen species were the words Shakespeare knew but did not use in print. Hill provided a model that allows the number of species in the population to be estimated by imposing a zero-truncated negative binomial prior distribution on that number. However, Hill's final analysis is limited to a uniform prior distribution for the number of species; it is stated that this gives a sufficiently close approximation. In the present paper, it is proposed both to derive and to analyse a model based on the zero-truncated negative binomial prior distribution for the number of species. A similar generalization of Hill's model for Zipf's law was given by Chen (1980).

The form of the joint prior density for the unknown relative abundances of the species, given by Equation (2.1), is derived under the assumptions used by Kempton and Wedderburn (1978). They explained that when the frequency distribution of the species abundances can be approximated by a continuous distribution, the gamma distribution provides a reasonable fit in practice. Hence, the abundances of the species in the population are taken to be independent observations from a gamma distribution. The relative abundances are then beta variates, and their joint density function is as shown in (2.1).
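The two modelling ingredients described above can be illustrated with a minimal Python sketch. It is not the paper's derivation, and Equation (2.1) is not reproduced here. The parameterization of the zero-truncated negative binomial (parameters `r` and `p`), the gamma shape value, and the function names `ztnb_pmf` and `relative_abundances` are all assumptions made for illustration only.

```python
import math
import random


def ztnb_pmf(n, r, p):
    """Zero-truncated negative binomial pmf at n = 1, 2, ...

    Assumed parameterization (not taken from the paper): the untruncated
    pmf is Gamma(n + r) / (Gamma(r) * n!) * p**r * (1 - p)**n for n >= 0,
    and truncation divides by 1 - p**r, the probability that n >= 1.
    """
    if n < 1:
        return 0.0
    # Work on the log scale so that large n does not overflow.
    log_pmf = (math.lgamma(n + r) - math.lgamma(r) - math.lgamma(n + 1)
               + r * math.log(p) + n * math.log(1.0 - p))
    return math.exp(log_pmf) / (1.0 - p ** r)


def relative_abundances(num_species, shape, rng):
    """Draw independent gamma abundances and normalize them to sum to 1."""
    abundances = [rng.gammavariate(shape, 1.0) for _ in range(num_species)]
    total = sum(abundances)
    return [a / total for a in abundances]


# Hypothetical values chosen only to exercise the functions.
rng = random.Random(0)
props = relative_abundances(num_species=5, shape=2.0, rng=rng)
prior_mass = sum(ztnb_pmf(n, r=2.0, p=0.1) for n in range(1, 2000))
```

Here `props` holds five relative abundances that sum to one, and `prior_mass` accumulates the prior probability over n = 1, ..., 1999, which should be very close to one for these parameter values. Normalizing independent gamma variates that share a common scale yields a Dirichlet joint distribution, whose marginals are beta distributions; this is the standard result behind the statement that the relative abundances are beta variates.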
