Proportion data with support lying in the interval [0,1] are a commonplace in various domains of medicine and public health. When these data are available as clusters, it is important to correctly incorporate the within-cluster correlation to improve the estimation efficiency while conducting regression-based risk evaluation. Furthermore, covariates may exhibit a nonlinear relationship with the (proportion) responses while quantifying disease status. As an alternative to various existing classical methods for modeling proportion data (such as augmented Beta regression) that uses maximum likelihood, or generalized estimating equations, we develop a partially linear additive model based on the quadratic inference function. Relying on quasi-likelihood estimation techniques and polynomial spline approximation for unknown nonparametric functions, we obtain the estimators for both parametric part and nonparametric part of our model and study their large-sample theoretical properties. We illustrate the advantages and usefulness of our proposition over other alternatives via extensive simulation studies, and application to a real dataset from a clinical periodontal study.
Read full abstract