The analysis of clustered binary data is a common task in many areas of application. Parametric approaches to such data are numerous, but there has been much recent interest in nonparametric and semiparametric approaches. When cluster sizes are unequal, the marginal distributions are often assumed to be compatible across cluster sizes, so that semiparametric approaches can be developed even when there is little replication at each cluster size. Here, we use this marginal compatibility assumption to extend flexible semiparametric Bayesian methods, which are able to shrink towards a “parametric backbone”, to the situation where there are few replicated observations for each combination of cluster size and covariate value. A motivating application is the analysis of developmental toxicology data, where pregnant laboratory animals are exposed to a dose of some potentially toxic compound and interest lies in describing the distribution, as a function of dose level, of the number of fetuses exhibiting some characteristic abnormality. Flexible semiparametric methods are required here, as the data typically exhibit overdispersion and complex structure. We also consider a further extension appropriate to clustered binary data where there is little or no replication for distinct covariate values.
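
The abstract does not spell out the marginal compatibility assumption. As a rough illustration only, the sketch below uses the form commonly stated for exchangeable binary responses: the distribution of the number of responses in a cluster of size n is what one would obtain by drawing n members at random from a larger cluster of size N. The function names (`marginalize_pmf`, `binomial_pmf`) are hypothetical and are not taken from the paper, and the paper's own formulation may differ.

```python
from math import comb


def marginalize_pmf(pmf_big, n_small):
    """Thin a response-count pmf from cluster size N down to size n_small.

    pmf_big[s] is P(s responses) in a cluster of size N = len(pmf_big) - 1.
    Assumption (not from the source): marginal compatibility holds in its
    usual hypergeometric form, i.e. a size-n cluster behaves like a random
    subset of n members from a size-N cluster.
    """
    n_big = len(pmf_big) - 1
    if not 0 <= n_small <= n_big:
        raise ValueError("n_small must lie between 0 and the larger cluster size")
    denom = comb(n_big, n_small)
    pmf_small = []
    for r in range(n_small + 1):
        total = 0.0
        # s responders in the big cluster can yield r responders in the subset
        # only if r <= s <= N - n + r.
        for s in range(r, n_big - n_small + r + 1):
            weight = comb(s, r) * comb(n_big - s, n_small - r) / denom
            total += weight * pmf_big[s]
        pmf_small.append(total)
    return pmf_small


def binomial_pmf(n, p):
    """Binomial(n, p) pmf as a list, used here only as a sanity check."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
```

A binomial model is trivially compatible, so thinning a Binomial(10, 0.3) pmf to size 4 should reproduce Binomial(4, 0.3) up to floating-point error. Overdispersed models for developmental toxicology data are the cases where the assumption carries real content.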