Abstract

One of the major challenges for epidemiologists is to understand causal relationships between risk factors and health outcomes by analysing data from observational studies. Epidemiologists know all too well that correlations, such as those in regression analysis, rarely imply causation, and it would be highly desirable to have a methodology for observational studies, analogous to randomization in experimental studies, that can discover causes and effects amongst variables or at least confirm or refute proposed causal relationships. Randomized controlled trials (RCTs) are certainly the ‘gold standard’ for establishing causes and effects, but quite often it is either unethical or infeasible to conduct RCTs to test causal relations in epidemiological research. Epidemiologists need a methodology that combines directed acyclic graphs (DAGs) for the conceptual construction of causal models with regression analysis for testing those models. It is therefore surprising that structural equation modelling (SEM) has not been used as frequently in epidemiology as in the social sciences, given that both epidemiologists and social scientists want to delineate causes and effects from observational data. The difference between DAGs and path diagrams in SEM is almost trivial to epidemiologists: the path between two variables can only have one direction in DAGs. 1,2 An individual path in SEM is tested in the same way as a regression coefficient in regression analysis, and model fit indices provided by SEM software packages help analysts to assess the adequacy of the proposed causal model compared with the observed associations in the sample data. 3,4 Why then is SEM still under-utilized in epidemiology? This question was posed by a commentary 5 in this journal a few years ago.
The answers cited included unfamiliar terminology (SEM theory is formulated in Greek letters), restrictive assumptions about variables (outcome variables need to be continuous), difficulties in testing interactions and non-linear relationships (it can be quite tedious to set up SEM models to do these), and equivalent models (two different causal models may imply the same correlation structure, so it is impossible to tell which is better). 6 Recent advances in SEM theory and software development have resolved some of these issues: new estimation methods do not require the strict assumption of multivariate normality, 7 and the outcome variables can be binary, ordinal or counts. 7–9 The basic rationale behind SEM is rather simple: multiple linear equations are used to specify causal relationships between variables, some of which are manifest variables (i.e. observed and collected by the researchers), whilst others are latent variables (i.e. derived from the observed variables by specifying their relations using equations), such as those in factor analysis. The multiple equations in each causal model entail a certain structure of correlational relationships between the observed variables, which is usually given as a correlation (or covariance) matrix Σ. The estimation procedure minimizes the difference between Σ and the observed correlation/covariance matrix S, as formulated by a likelihood function. The χ² test is then used to evaluate the difference between these two matrices, taking into account the number of estimated parameters in the proposed model. When the χ² value is large (i.e. the difference between the two matrices is large) relative to the model’s degrees of freedom, the proposed model is rejected, i.e. something in the causal relationships specified by the proposed model is not quite right and requires a second thought. When the χ² value is small, we fail to reject the model, or tentatively accept the model as adequate.
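As a rough illustration of this estimation-and-test logic, the sketch below evaluates the standard maximum-likelihood discrepancy between S and a model-implied matrix, then the resulting χ² test. Everything specific in it is an assumption made for illustration and does not come from the commentary: the three-variable chain model X → M → Y (one degree of freedom), the made-up correlation matrix, the sample size of 200, and all function names. It relies on the fact that, for this simple recursive model, the fitted matrix reproduces S everywhere except the X–Y entry, so no iterative optimizer is needed.

```python
import math

def det_and_inverse(a):
    """Determinant and inverse of a small square matrix (Gauss-Jordan elimination)."""
    n = len(a)
    # Augment the matrix with the identity so row operations also build the inverse.
    m = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign of the determinant
        pv = m[col][col]
        det *= pv
        m[col] = [v / pv for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return det, [row[n:] for row in m]

def ml_discrepancy(S, Sigma):
    """ML fit function: F = ln|Sigma| + tr(S Sigma^-1) - ln|S| - p."""
    p = len(S)
    det_S, _ = det_and_inverse(S)
    det_Sigma, inv_Sigma = det_and_inverse(Sigma)
    trace = sum(S[i][k] * inv_Sigma[k][i] for i in range(p) for k in range(p))
    return math.log(det_Sigma) + trace - math.log(det_S) - p

def chain_implied(S):
    """Fitted matrix for the hypothetical chain X -> M -> Y (variable order X, M, Y).

    Assumption: the model fixes only cov(X, Y) = cov(X, M) * cov(M, Y) / var(M).
    """
    Sigma = [row[:] for row in S]
    Sigma[0][2] = Sigma[2][0] = S[0][1] * S[1][2] / S[1][1]
    return Sigma

def chi_square_test_df1(S, Sigma, n_obs):
    """Return (T, p) with T = (N - 1) * F, referred to chi-square with 1 df."""
    T = max((n_obs - 1) * ml_discrepancy(S, Sigma), 0.0)  # guard against rounding below 0
    p_value = math.erfc(math.sqrt(T / 2.0))  # survival function of chi-square, df = 1
    return T, p_value

# Made-up correlations: the chain implies r_XY = 0.5 * 0.4 = 0.2, but 0.35 is "observed".
S = [[1.0, 0.5, 0.35],
     [0.5, 1.0, 0.4],
     [0.35, 0.4, 1.0]]
T, p_value = chi_square_test_df1(S, chain_implied(S), n_obs=200)
print(f"chi-square = {T:.2f}, df = 1, p = {p_value:.4f}")
```

A small p-value here would lead to rejecting the chain model, for example because a direct X → Y path is missing. In practice a dedicated SEM package would handle estimation for models that have no closed-form solution.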
Because of the possibility of equivalent models, our model may still be wrong in terms of the causal relations amongst variables but happen to entail the same correlation structure (the same S) as the ‘true’ model. So SEM seems to be an endeavour to search for the truth by approximation, and in this respect, doing SEM is quite similar to the process of ‘Conjectures and Refutations’ as described by the great philosopher
