Abstract
Understanding how useful any particular set of event data might be for conflict research requires appropriate methods for assessing validity when ground truth data about the population of interest do not exist. We argue that a total error framework can provide better leverage on these critical questions than previous methods have been able to deliver. We first define a total event data error approach for identifying 19 types of error that can affect the validity of event data. We then address the challenge of applying a total error framework when authoritative ground truth about the actual distribution of relevant events is lacking. We argue that carefully constructed gold standard datasets can effectively benchmark validity problems even in the absence of ground truth data about event populations. To illustrate the limitations of conventional strategies for validating event data, we present a case study of Boko Haram activity in Nigeria over a 3-month offensive in 2015 that compares events generated by six prominent event extraction pipelines—ACLED, SCAD, ICEWS, GDELT, PETRARCH, and the Cline Center’s SPEED project. We conclude that conventional ways of assessing validity in event data using only published datasets offer little insight into potential sources of error or bias. Finally, we illustrate the benefits of validating event data using a total error approach by showing how the gold standard approach used to validate SPEED data offers a clear and robust method for detecting and evaluating the severity of temporal errors in event data.