Through theory, simulation, and analyses of developmental toxicity data, we evaluate two-stage log-linear negative binomial and overdispersed Poisson models that accommodate random cluster size when making inferences for clustered binary response data. Random cluster size has been of particular concern in animal developmental toxicity studies of teratogenic chemicals, where litter sizes vary. Such an application of these two-stage models has not been reported previously but is theoretically justified. It is based upon an extension of a little-known result: the marginal distribution for cluster-level counts of binary responses is Poisson under a first-stage binomial distribution for such counts and a second-stage Poisson distribution for cluster size. The proposed models are compared with generalized estimating equations (GEE) estimation of a logistic model that treats cluster size as fixed. The simulations and data analyses suggest that both the negative binomial and GEE logistic models consistently lead to correct inference when cluster size is a Poisson random variable, although there are some exceptions when cluster size is underdispersed or when it depends upon a covariate.
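The Poisson result cited above can be checked numerically. The following is a minimal sketch, not the authors' method: it covers only the basic binomial-Poisson case, not the negative binomial or overdispersed extensions. It assumes a cluster size N that is Poisson with mean lam and a count Y that, given N = n, is binomial with n trials and response probability p, and it compares the marginal distribution of Y with a Poisson distribution with mean lam * p. The function names, the values lam = 12 and p = 0.25, and the truncation point n_max are illustrative assumptions that do not come from the paper.

```python
import math

def poisson_pmf(k, mean):
    # Computed on the log scale so that large k does not overflow.
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def binomial_pmf(y, n, p):
    return math.comb(n, y) * p**y * (1 - p)**(n - y)

def marginal_pmf(y, lam, p, n_max=200):
    # Sum the first-stage binomial over second-stage Poisson cluster sizes.
    # Truncating at n_max is an assumption; the omitted tail mass is
    # negligible for the illustrative mean used below.
    return sum(
        poisson_pmf(n, lam) * binomial_pmf(y, n, p)
        for n in range(y, n_max + 1)
    )

# Hypothetical values: mean litter size 12, response probability 0.25.
lam, p = 12.0, 0.25

# Largest gap between the two-stage marginal and Poisson(lam * p).
max_abs_diff = max(
    abs(marginal_pmf(y, lam, p) - poisson_pmf(y, lam * p))
    for y in range(21)
)
print(max_abs_diff)
```

Under these assumptions the two distributions should agree to floating-point precision, which is why a log-linear count model for the number of responders per cluster can absorb Poisson variation in cluster size.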