Abstract

We develop factor copula models to analyse the dependence among mixed continuous and discrete responses. Factor copula models are canonical vine copulas that involve both observed and latent variables, hence they allow tail, asymmetric and nonlinear dependence. They can be explained as conditional independence models with latent variables that do not necessarily have an additive latent structure. We focus on important issues of interest to the social data analyst, such as model selection and goodness of fit. Our general methodology is demonstrated with an extensive simulation study and illustrated by reanalysing three mixed response data sets. Our studies suggest that there can be a substantial improvement over the standard factor model for mixed data and make the argument for moving to factor copula models.

Highlights

  • It is very common in social science to deal with data sets that have mixed continuous and discrete responses

  • There are two approaches for modelling multivariate mixed data with latent variables: the underlying variable approach that treats all variables as continuous by assuming the discrete responses are a manifestation of underlying continuous variables that usually follow the normal distribution (e.g., Lee, Poon, & Bentler, 1992; Muthen, 1984; Quinn, 2004); and the response function approach that postulates distributions on the observed variables conditional on the latent variables usually being from the exponential family (e.g., Huber, Ronchetti, & Victoria-Feser, 2004; Moustaki, 1996; Moustaki & Knott, 2000; Moustaki & Victoria-Feser, 2006; Wedel & Kamakura, 2001)

  • The factor copula model is the most general factor model, as (a) it has the standard factor model with an additive latent structure as a special case when bivariate normal (BVN) copulas are used, (b) it can have a non-additive latent structure if copulas other than BVN are used, and (c) the parameters of the univariate distributions are separated from the copula parameters, which are interpretable as the dependence of an observed variable on a latent variable, or the conditional dependence of an observed variable on a latent variable given the preceding latent variables
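The BVN special case in the last highlight can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: the function name, the parameter values, the normal margin and the ordinal cutpoints are all assumptions chosen for illustration. It draws from a one-factor copula with BVN linking copulas, checks that the normal scores have the correlations theta_j * theta_k implied by an additive one-factor model, and then maps the uniform scores to one continuous and one ordinal response.

```python
import numpy as np
from scipy.stats import norm


def simulate_one_factor_bvn(n, thetas, seed=0):
    """Draw n samples from a one-factor copula with BVN linking copulas.

    Returns the uniform scores u (n x d) and the normal scores z (n x d).
    `thetas` are the BVN copula parameters linking each observed variable
    to the single latent variable (hypothetical values, for illustration).
    """
    rng = np.random.default_rng(seed)
    thetas = np.asarray(thetas, dtype=float)
    v = rng.uniform(size=n)                    # latent variable V ~ U(0, 1)
    eps = rng.uniform(size=(n, thetas.size))   # independent U(0, 1) draws
    z0 = norm.ppf(v)                           # latent variable on the normal scale
    # Inverse conditional BVN copula C^{-1}_{j|V}(eps | v) on the normal scale;
    # this is exactly the additive structure Z_j = theta_j * Z_0 + noise.
    z = thetas * z0[:, None] + np.sqrt(1.0 - thetas**2) * norm.ppf(eps)
    return norm.cdf(z), z


thetas = [0.8, 0.6, 0.5]
u, z = simulate_one_factor_bvn(200_000, thetas, seed=1)

# Correlations implied by the additive one-factor model: theta_j * theta_k.
implied = np.outer(thetas, thetas)
np.fill_diagonal(implied, 1.0)
emp_corr = np.corrcoef(z, rowvar=False)

# Mixed responses: the margins are chosen separately from the copula.
y_cont = norm.ppf(u[:, 0], loc=10.0, scale=2.0)   # continuous, N(10, 2^2) margin
cutpoints = np.array([0.3, 0.7])                  # assumed ordinal cutpoints
y_ord = np.searchsorted(cutpoints, u[:, 1])       # ordinal categories 0, 1, 2

print(np.round(emp_corr, 3))
print(np.round(implied, 3))
```

Replacing the BVN conditional quantile with that of another bivariate copula (for example one with tail dependence) would give a latent structure that is no longer additive, while the marginal transformations at the end stay unchanged.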

Summary

Introduction

It is very common in social science (e.g., in surveys) to deal with data sets that have mixed continuous and discrete responses. ML estimation is feasible, especially when the number of latent variables is small. Both existing approaches (the underlying variable approach and the response function approach) are restricted to the MVN assumption for the observed or latent variables, which is not valid in the realistic scenario of tail asymmetry or tail dependence existing in the mixed data. Factor copulas are vine copula models that involve both observed and latent variables. They are highly flexible through their specification from bivariate parametric copulas with different tail dependence or asymmetry properties. Factor copula models are more interpretable and fit better than vine copula models when dependence can be explained through latent variables. They are closed under margins, that is, lower-order marginals belong to the same parametric family of copulas, and a different permutation of the observed variables has exactly the same distribution.
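To make the role of the latent variable in ML estimation concrete, the joint density of a one-factor copula model for mixed responses can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from the summary: V is a latent variable uniform on (0, 1), F_j and f_j are the univariate cdf and density of the j-th response, C_{jV} is the bivariate copula linking response j to V with density c_{jV}, and (x_q, w_q) are quadrature nodes and weights on (0, 1).

```latex
% Conditional independence given the latent variable V ~ U(0,1)
f(y_1,\ldots,y_d) \;=\; \int_0^1 \prod_{j=1}^{d} f_{j|V}(y_j \mid v)\, dv ,

% Contribution of response j, depending on its type
f_{j|V}(y \mid v) \;=\;
\begin{cases}
  c_{jV}\bigl(F_j(y), v\bigr)\, f_j(y), & \text{$y_j$ continuous},\\[4pt]
  C_{j|V}\bigl(F_j(y) \mid v\bigr) - C_{j|V}\bigl(F_j(y-1) \mid v\bigr), & \text{$y_j$ discrete (integer-valued)},
\end{cases}
\qquad
C_{j|V}(u \mid v) \;=\; \frac{\partial C_{jV}(u,v)}{\partial v} .

% One-dimensional numerical integration (assumed quadrature rule)
f(y_1,\ldots,y_d) \;\approx\; \sum_{q=1}^{n_q} w_q \prod_{j=1}^{d} f_{j|V}(y_j \mid x_q) .
```

With one latent variable the likelihood needs only a one-dimensional integral per observation; each additional latent variable adds one dimension to the integral, which is one way to read the statement that ML estimation is feasible when the number of latent variables is small.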

The factor copula model for mixed responses
Estimation
Copula modelling
Model selection
Vuong’s test for parametric model comparison
Applications
Simulations
Discussion
Conflicts of interest
Data availability statement