Abstract

While online crowdsourced text transcription projects have proliferated in the last decade, there is a need within the broader field to understand differences in project outcomes as they relate to task design, as well as to experiment with models of online crowdsourced transcription that have not yet been explored. The experiment discussed in this paper evaluates newly-built tools on the Zooniverse.org crowdsourcing platform and attempts to answer the research questions: "Does the current Zooniverse methodology of multiple independent transcribers and aggregation of results render higher-quality outcomes than allowing volunteers to see previous transcriptions and/or markings by other users? How does each methodology impact the quality and depth of analysis and participation?" To answer these questions, the Zooniverse team ran an A/B experiment on the project Anti-Slavery Manuscripts at the Boston Public Library. This paper shares the results of this study and describes the process of designing the experiment and the metrics used to evaluate each transcription method. These include comparison of aggregate transcription results with ground truth data; evaluation of annotation methods; the time it took volunteers to transcribe each dataset; and the level of engagement with other project elements, such as posting on the message board or reading supporting documentation. Particular focus is given to the (at times) competing goals of data quality, efficiency, volunteer engagement, and user retention, all of which are of high importance for projects that focus on data from galleries, libraries, archives and museums. Ultimately, this paper aims to provide a model for impactful, intentional design and study of online crowdsourced transcription methods, as well as to shed light on the associations between project design, methodology, and outcomes.
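As a concrete illustration of the ground truth comparison mentioned above, the sketch below shows one possible way to score aggregated transcription lines against gold standard lines using a normalized string-similarity ratio and to average those scores over a manuscript page. This is an illustrative assumption written in Python with the standard-library difflib module; it is not the authors' actual aggregation or evaluation pipeline, and the function names and sample text are invented for demonstration.

    # Illustrative sketch (not the paper's pipeline): score aggregated
    # transcription lines against gold standard lines with a normalized
    # string-similarity ratio, then average over a manuscript page.
    from difflib import SequenceMatcher

    def line_similarity(transcribed: str, gold: str) -> float:
        """Return a 0-1 similarity ratio between one transcribed line and its gold standard line."""
        return SequenceMatcher(None, transcribed.strip(), gold.strip()).ratio()

    def page_score(transcribed_lines, gold_lines):
        """Average per-line similarity for one manuscript page."""
        scores = [line_similarity(t, g) for t, g in zip(transcribed_lines, gold_lines)]
        return sum(scores) / len(scores) if scores else 0.0

    # Hypothetical example: a near-perfect aggregate with one misspelling.
    gold = ["I received your letter of the 14th instant"]
    aggregate = ["I recieved your letter of the 14th instant"]
    print(f"Similarity to gold standard: {page_score(aggregate, gold):.3f}")

A score of 1.0 indicates an exact match, and lower scores indicate greater variation from the gold standard; the paper compares cohorts by how much their aggregated transcriptions vary from gold standard data, though the exact distance metric it uses may differ from this sketch.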

Highlights

  • The A version of the interface was modeled after existing Zooniverse text transcription projects that use individual transcription, while the B version introduced a new, collaborative transcription method

  • In this A/B experiment, the cohort of volunteers sorted into the collaborative workflow produced transcription data with significantly less variation from the gold standard data than the cohort sorted into the individual workflow

  • Low click-rates from the header menu suggest that we may need to make the information from the About page more prominent within the transcription interface

Summary

INTRODUCTION

In his seminal work The Wisdom of Crowds (2004), James Surowiecki argued that a heterogeneous group of individuals, including experts and non-experts, could more accurately and efficiently make decisions or offer solutions to complex problems than the average expert. Jeff Howe coined the term "crowdsourcing" in a 2006 Wired article to describe the practice of outsourcing work to a large, undefined group of people through an open call. Howe's definition has since been extended and applied to academic research, and scholarly definitions of crowdsourcing range from concise ("the process of leveraging public participation in or contributions to projects and activities," Hedges & Dunn 2017, 1) to all-encompassing ("a type of participative online activity in which an individual, an institution, a non-profit organization, or a company proposes to a group of individuals [the 'crowd'] of varying knowledge, heterogeneity, and number, via a flexible open call, the voluntary undertaking of a task," Estellés-Arolas & González-Ladrón-de-Guevara 2012, 9-10). The latter recalls Surowiecki's emphasis on the importance of heterogeneity and independence of decision makers, and most academic definitions include some acknowledgement of crowd diversity, emphasizing the importance of non-specialist participation (see Van Hyning 2019, 4-5; for further discussion of the definition of crowdsourcing, see Brabham 2013; Ridge 2014; and Terras 2016). The authors sought to interrogate the role of independent decision-making, one of the fundamental principles of crowdsourcing.

TRANSFORMING LIBRARIES AND ARCHIVES THROUGH CROWDSOURCING
EXPERIMENT DESIGN
Annotation Aggregation
Consensus Score
Gold Standard Comparison
Analysis with Outliers Removed
Cohorts by Users and Sessions
Cohorts by Classification Numbers
Collaborative Cohort Behavior
Engagement by Cohort
CONCLUSION
Findings
Next Steps