Individual identification using DNA fingerprinting methods is emerging as a critical tool in conservation genetics and molecular ecology. Statistical methods that estimate the probability of sampling identical genotypes using theoretical equations generally assume random associations between alleles within and among loci. These calculations are probably inaccurate for many animal and plant populations due to population substructure. We evaluated the accuracy of a probability of identity (P(ID)) estimation by comparing the observed and expected P(ID), using large nuclear DNA microsatellite data sets from three endangered species: the grey wolf (Canis lupus), the brown bear (Ursus arctos), and the Australian northern hairy-nosed wombat (Lasiorhinus krefftii). The theoretical estimates of P(ID) were consistently lower than the observed P(ID), and could differ by as much as three orders of magnitude. To help researchers and managers avoid potential problems associated with this bias, we introduce an equation for P(ID) between sibs. This equation provides an estimator that can be used as a conservative upper bound for the probability of observing identical multilocus genotypes between two individuals sampled from a population. We suggest computing the actual observed P(ID) when possible and give general guidelines for the number of codominant and dominant marker loci required to achieve a reasonably low P(ID) (e.g. 0.01–0.0001).
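As an illustration of the two estimators contrasted above, the following is a minimal sketch for codominant loci, not the authors' implementation. The abstract does not state the equations; the sketch assumes the standard Hardy-Weinberg form for unrelated individuals, P(ID) = Σ p_i^4 + ΣΣ_{i<j} (2 p_i p_j)^2, and the commonly cited sib form, P(ID)sib = 0.25 + 0.5 Σ p_i^2 + 0.5 (Σ p_i^2)^2 − 0.25 Σ p_i^4. The function names and the equal-loci simplification in `loci_needed` are hypothetical, and multiplying across loci assumes the loci are independent, which is the very assumption the abstract warns may fail.

```python
from itertools import combinations
from math import ceil, log, prod


def pid_unrelated(freqs):
    """Single-locus P(ID) for unrelated individuals (assumes Hardy-Weinberg)."""
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


def pid_sibs(freqs):
    """Single-locus P(ID) between full sibs (assumed form, see lead-in)."""
    s2 = sum(p ** 2 for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 0.25 + 0.5 * s2 + 0.5 * s2 ** 2 - 0.25 * s4


def multilocus(per_locus_values):
    """Combine per-locus values by multiplication (assumes independent loci)."""
    return prod(per_locus_values)


def loci_needed(per_locus_pid, target):
    """Hypothetical helper: number of equally informative loci needed
    for the multilocus value to fall at or below the target."""
    return ceil(log(target) / log(per_locus_pid))


# Toy example: one locus with two equally frequent alleles.
freqs = [0.5, 0.5]
print(pid_unrelated(freqs), pid_sibs(freqs))
print(loci_needed(pid_unrelated(freqs), 0.01), loci_needed(pid_sibs(freqs), 0.01))
```

Under these assumptions the sib value is never smaller than the unrelated value for the same allele frequencies, which is why it can serve as a conservative upper bound and why it implies that more loci are needed to reach a given target.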