Abstract
Expert-coded datasets provide scholars with otherwise unavailable data on important concepts. However, expert coders vary in their reliability and scale perception, potentially resulting in substantial measurement error. These concerns are acute in expert coding of key concepts for peace research. Here I examine (1) the implications of these concerns for applied statistical analyses, and (2) the degree to which different modeling strategies ameliorate them. Specifically, I simulate expert-coded country-year data with different forms of error and then regress civil conflict onset on these data, using five different modeling strategies. Three of these strategies involve regressing conflict onset on point estimate aggregations of the simulated data: the mean and median over expert codings, and the posterior median from a latent variable model. The remaining two strategies incorporate measurement error from the latent variable model into the regression process by using multiple imputation and a structural equation model. Analyses indicate that expert-coded data are relatively robust: across simulations, almost all modeling strategies yield regression results roughly in line with the assumed true relationship between the expert-coded concept and outcome. However, the introduction of measurement error to expert-coded data generally results in attenuation of the estimated relationship between the concept and conflict onset. The level of attenuation varies across modeling strategies: a structural equation model is the most consistently robust estimation technique, while the median over expert codings and multiple imputation are the least robust.
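The attenuation described above can be illustrated with a minimal sketch. It is not the simulation design used in the paper: the sample size, number of experts, error distribution, intercept, and true slope are all assumed values chosen for illustration, and the helper names `fit_logit` and `simulate` are hypothetical. Only expert reliability is varied here (differences in scale perception are not modeled), and only the mean and median aggregations are compared.

```python
import numpy as np

def fit_logit(x, y, iters=25):
    """Logistic regression of y on [1, x] via Newton-Raphson.

    Returns the coefficient vector (intercept, slope).
    """
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)            # score vector
        hess = X.T @ (X * w[:, None])   # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def simulate(n_obs=5000, n_experts=5, noise_sd=2.0, true_slope=1.0, seed=0):
    """Simulate a latent concept, a binary onset outcome, and noisy expert codings.

    All parameter defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=n_obs)
    # Binary outcome generated from the true latent concept.
    p_onset = 1.0 / (1.0 + np.exp(-(-2.0 + true_slope * latent)))
    onset = rng.binomial(1, p_onset).astype(float)
    # Experts differ in reliability: each has their own error standard deviation.
    expert_sd = noise_sd * rng.uniform(0.5, 1.5, size=n_experts)
    codings = latent[:, None] + rng.normal(size=(n_obs, n_experts)) * expert_sd
    return latent, onset, codings

latent, onset, codings = simulate()
slope_true = fit_logit(latent, onset)[1]
slope_mean = fit_logit(codings.mean(axis=1), onset)[1]
slope_median = fit_logit(np.median(codings, axis=1), onset)[1]
print(f"slope using true concept: {slope_true:.3f}")
print(f"slope using expert mean:  {slope_mean:.3f}")
print(f"slope using expert median: {slope_median:.3f}")
```

Under these assumptions, the slopes estimated from the aggregated codings should come out smaller in magnitude than the slope estimated from the true concept, which is the attenuation pattern the abstract reports. Setting `noise_sd=0.0` removes the expert error, so the aggregations reproduce the true concept and the slopes coincide.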
Highlights
Expert-coded datasets provide scholars with otherwise unavailable data on important concepts
I regress conflict onset on the simulated data using five modeling strategies: three strategies utilize common point estimates, while the other two incorporate measurement uncertainty. Results from these analyses indicate that most methods roughly recover the correct relationship between the expert-coded concept and conflict onset, even when expert error is extremely high
Measurement error in expert-coded data generally attenuates the estimated relationship between the concept and conflict onset. The degree of attenuation varies across modeling strategies: the most robust strategy is a structural equation model, which iteratively estimates concept values and their relationship to conflict onset, while the median over expert codings and multiple imputation are the least robust