Can the online crowd match real expert judgments? How task complexity and coder location affect the validity of crowd‐coded data

Alexander Horn

doi:10.1111/1475-6765.12278

Abstract

Crowd‐coding is a novel technique that allows for fast, affordable and reproducible online categorisation of large numbers of statements. It combines judgements by multiple, paid, non‐expert coders to avoid miscodings. It has been argued that crowd‐coding could replace expert judgements, using the coding of political texts as an example in which both strategies produce similar results. Since crowd‐coding has the potential to extend the replication standard to data production and to ‘scale’ coding schemes based on a modest number of carefully devised test questions and answers, it is important that its possibilities and limitations are better understood. While previous results for low‐complexity coding tasks are encouraging, this study assesses whether and under what conditions simple and complex coding tasks can be outsourced to the crowd without sacrificing content validity in return for scalability. The simple task is to decide whether a party statement counts as a positive reference to a concept, in this case equality. The complex task is to distinguish between five concepts of equality. To account for crowd‐coders' contextual knowledge, the IP restrictions are varied. The basis for comparisons is 1,404 party statements, coded by experts and the crowd (resulting in 30,000 online judgements). Comparisons of the expert‐crowd match at the level of statements and party manifestos show that the results are substantively similar even for the complex task, suggesting that complex category schemes can be scaled via crowd‐coding. The match is only slightly higher when IP restrictions are used as an approximation of coder expertise.

Full Text