Background
Latent class models can be used to estimate diagnostic accuracy without a gold standard test. Early studies often assumed that tests were independent given the true disease state; however, this can lead to biased estimates when there are inter-test dependencies. Residual correlation plots and chi-squared statistics have commonly been used to assess the validity of the conditional independence assumption and, when it does not hold, to identify which test pairs are conditionally dependent. We aimed to assess the performance of these tools with a simulation study covering a wide range of scenarios.

Methods
We generated data sets from a model with four tests and a dependence between tests 1 and 2 within the diseased group. We varied sample size, prevalence, covariance, sensitivity and specificity, giving 504 combinations in total, with 1000 data sets for each combination. We fitted the conditional independence model in a Bayesian framework, and reported absolute bias, coverage, and how often the residual correlation plots, G² and χ² statistics indicated lack of fit globally or for each test pair.

Results
Across all settings, residual correlation plots, pairwise G² and pairwise χ² detected the correct correlated pair of tests only 12.1%, 10.3%, and 10.3% of the time, respectively, but incorrectly suggested dependence between tests 3 and 4 64.9%, 49.7%, and 49.5% of the time. These rates varied somewhat across parameter settings, with the tools appearing to perform more as intended when tests 3 and 4 were both much more accurate than tests 1 and 2. Residual correlation plots, G² and χ² statistics identified a lack of overall fit in 74.3%, 64.5% and 67.5% of models, respectively. The conditional independence model tended to overestimate the sensitivities of the correlated tests (median bias across all scenarios 0.094; 2.5th and 97.5th percentiles -0.003 and 0.397) and to underestimate prevalence and the specificities of the uncorrelated tests.

Conclusions
Residual correlation plots and chi-squared statistics cannot be relied upon to identify which tests are conditionally dependent, and they also have relatively low power to detect lack of overall fit. This is important because failure to account for conditional dependence can lead to highly biased parameter estimates.
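
The data-generating setup described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code, and several choices are assumptions made for illustration. The dependence between tests 1 and 2 is introduced by adding a covariance term to their joint cell probabilities among the diseased. The function names (`simulate`, `residual_correlations`) and the example parameter values are hypothetical. The expected correlations are computed from the known generating values rather than from a fitted Bayesian model, so the sketch shows only what a pairwise residual correlation measures. It does not reproduce the misattribution reported in the Results, which arises once the conditional independence model is fitted to the data.

```python
import random
from itertools import combinations


def simulate(n, prev, se, sp, cov_d, seed=0):
    """Draw n subjects with four binary tests.

    Tests 1 and 2 share covariance cov_d within the diseased group
    (assumed covariance parameterisation); all other test results are
    independent given disease status.
    """
    rng = random.Random(seed)
    cells = [(1, 1), (1, 0), (0, 1), (0, 0)]
    # Joint probabilities of (T1, T2) among the diseased.
    probs = [
        se[0] * se[1] + cov_d,
        se[0] * (1 - se[1]) - cov_d,
        (1 - se[0]) * se[1] - cov_d,
        (1 - se[0]) * (1 - se[1]) + cov_d,
    ]
    if min(probs) < 0:
        raise ValueError("cov_d is outside the feasible range for these sensitivities")
    data = []
    for _ in range(n):
        if rng.random() < prev:  # diseased subject
            t1, t2 = rng.choices(cells, weights=probs)[0]
            t3, t4 = (int(rng.random() < se[j]) for j in (2, 3))
        else:  # non-diseased subject: positives occur with probability 1 - specificity
            t1, t2, t3, t4 = (int(rng.random() < 1 - sp[j]) for j in range(4))
        data.append((t1, t2, t3, t4))
    return data


def corr_from_probs(pj, pk, pjk):
    """Correlation of two binary variables from marginal and joint positive rates."""
    return (pjk - pj * pk) / (pj * (1 - pj) * pk * (1 - pk)) ** 0.5


def residual_correlations(data, prev, se, sp):
    """Observed minus expected pairwise correlation under conditional independence."""
    n = len(data)
    out = {}
    for j, k in combinations(range(4), 2):
        # Observed marginal and joint positive rates.
        oj = sum(r[j] for r in data) / n
        ok = sum(r[k] for r in data) / n
        ojk = sum(r[j] * r[k] for r in data) / n
        # Rates implied by the conditional independence model.
        ej = prev * se[j] + (1 - prev) * (1 - sp[j])
        ek = prev * se[k] + (1 - prev) * (1 - sp[k])
        ejk = prev * se[j] * se[k] + (1 - prev) * (1 - sp[j]) * (1 - sp[k])
        out[(j + 1, k + 1)] = corr_from_probs(oj, ok, ojk) - corr_from_probs(ej, ek, ejk)
    return out


# Illustrative values only; these are not taken from the study's 504 settings.
se, sp = [0.7, 0.7, 0.8, 0.8], [0.9, 0.9, 0.9, 0.9]
data = simulate(n=20000, prev=0.3, se=se, sp=sp, cov_d=0.1)
for pair, r in residual_correlations(data, 0.3, se, sp).items():
    print(pair, round(r, 3))
```

With the true parameters plugged in, a clearly positive residual would be expected only for the pair (1, 2). In practice the parameters are estimated under the misspecified model, and the estimates can absorb part of the dependence and shift the apparent lack of fit to other pairs.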