Abstract

While online crowdsourced text transcription projects have proliferated in the last decade, there is a need within the broader field to understand differences in project outcomes as they relate to task design, as well as to experiment with models of online crowdsourced transcription that have not yet been explored. The experiment discussed in this paper evaluates newly-built tools on the Zooniverse.org crowdsourcing platform and attempts to answer the research questions: "Does the current Zooniverse methodology of multiple independent transcribers and aggregation of results render higher-quality outcomes than allowing volunteers to see previous transcriptions and/or markings by other users? How does each methodology impact the quality and depth of analysis and participation?" To answer these questions, the Zooniverse team ran an A/B experiment on the project Anti-Slavery Manuscripts at the Boston Public Library. This paper shares the results of this study and describes the process of designing the experiment and the metrics used to evaluate each transcription method. These include comparison of aggregate transcription results with ground truth data; evaluation of annotation methods; the time it took volunteers to transcribe each dataset; and the level of engagement with other project elements, such as posting on the message board or reading supporting documentation. Particular focus is given to the (at times) competing goals of data quality, efficiency, volunteer engagement, and user retention, all of which are of high importance for projects that focus on data from galleries, libraries, archives and museums. Ultimately, this paper aims to provide a model for impactful, intentional design and study of online crowdsourced transcription methods, as well as to shed light on the associations between project design, methodology, and outcomes.
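As a concrete illustration of the ground truth comparison mentioned above, the sketch below shows one possible way to score aggregated transcription lines against gold standard lines using a normalized string-similarity ratio and to average those scores over a manuscript page. This is an illustrative assumption written in Python with the standard-library difflib module; it is not the authors' actual aggregation or evaluation pipeline, and the function names and sample text are invented for demonstration.

    # Illustrative sketch (not the paper's pipeline): score aggregated
    # transcription lines against gold standard lines with a normalized
    # string-similarity ratio, then average over a manuscript page.
    from difflib import SequenceMatcher

    def line_similarity(transcribed: str, gold: str) -> float:
        """Return a 0-1 similarity ratio between one transcribed line and its gold standard line."""
        return SequenceMatcher(None, transcribed.strip(), gold.strip()).ratio()

    def page_score(transcribed_lines, gold_lines):
        """Average per-line similarity for one manuscript page."""
        scores = [line_similarity(t, g) for t, g in zip(transcribed_lines, gold_lines)]
        return sum(scores) / len(scores) if scores else 0.0

    # Hypothetical example: a near-perfect aggregate with one misspelling.
    gold = ["I received your letter of the 14th instant"]
    aggregate = ["I recieved your letter of the 14th instant"]
    print(f"Similarity to gold standard: {page_score(aggregate, gold):.3f}")

A score of 1.0 indicates an exact match, and lower scores indicate greater variation from the gold standard; the paper compares cohorts by how much their aggregated transcriptions vary from gold standard data, though the exact distance metric it uses may differ from this sketch.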

Highlights

  • The A version of the interface was modeled after existing Zooniverse text transcription projects that use individual transcription, while the B version introduced a new, collaborative transcription method

  • In this A/B experiment, the cohort of volunteers sorted into the collaborative workflow produced transcription data with significantly less variation from the gold standard data than the cohort sorted into the individual workflow

  • Low click-rates from the header menu suggest that we may need to make the information from the About page more prominent within the transcription interface

Summary

INTRODUCTION

In his seminal work The Wisdom of Crowds (2004), James Surowiecki argued that a heterogeneous group of individuals, including experts and non-experts, could more accurately and efficiently make decisions or offer solutions to complex problems than the average expert. Jeff Howe coined the term "crowdsourcing" in a 2006 Wired article to describe the practice of outsourcing work to a large, undefined group of people through an open call. Howe's definition has since been extended and applied to academic research, and scholarly definitions of crowdsourcing range from concise ("the process of leveraging public participation in or contributions to projects and activities," Hedges & Dunn 2017, 1) to all-encompassing ("a type of participative online activity in which an individual, an institution, a non-profit organization, or a company proposes to a group of individuals [the 'crowd'] of varying knowledge, heterogeneity, and number, via a flexible open call, the voluntary undertaking of a task," Estellés-Arolas & González-Ladrón-de-Guevara 2012, 9-10). The latter recalls Surowiecki's emphasis on the importance of heterogeneity and independence of decision makers, and most academic definitions include some acknowledgement of crowd diversity, emphasizing the importance of non-specialist participation (see Van Hyning 2019, 4-5; for further discussion of the definition of crowdsourcing, see Brabham 2013; Ridge 2014; and Terras 2016). The authors sought to interrogate the role of independent decision-making, one of the fundamental principles of crowdsourcing.

TRANSFORMING LIBRARIES AND ARCHIVES THROUGH CROWDSOURCING
EXPERIMENT DESIGN
Annotation Aggregation
Consensus Score
Gold Standard Comparison
Analysis with Outliers Removed
Cohorts by Users and Sessions
Cohorts by Classification Numbers
Collaborative Cohort Behavior
Engagement by Cohort
CONCLUSION
Findings
Next Steps