Abstract

Finite mixture of regressions (FMR) are commonly used to model heterogeneous effects of covariates on a response variable in settings where there are unknown underlying subpopulations. FMRs, however, cannot accommodate situations where covariates' effects also vary according to an "index" variable-known as finite mixture of varying coefficient regression (FM-VCR). Although complex, this situation occurs in real data applications: the osteocalcin (OCN) data analyzed in this manuscript presents a heterogeneous relationship where the effect of a genetic variant on OCN in each hidden subpopulation varies over time. Oftentimes, the number of covariates with varying coefficients also presents a challenge: in the OCN study, genetic variants on the same chromosome are considered jointly. The relative proportions of hidden subpopulations may also change over time. Nevertheless, existing methods cannot provide suitable solutions for accommodating all these features in real data applications. To fill this gap, we develop statistical methodologies based on regularized local-kernel likelihood for simultaneous parameter estimation and variable selection in sparse FM-VCR models. We study large-sample properties of the proposed methods. We then carry out a simulation study to evaluate the performance of various penalties adopted for our regularized approach and ascertain the ability of a BIC-type criterion for estimating the number of subpopulations. Finally, we applied the FM-VCR model to analyze the OCN data and identified several covariates, including genetic variants, that have age-dependent effects onOCN.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call