Abstract

One of the major bottlenecks in the development of data-driven AI systems is the cost of reliable human annotations. The recent advent of crowdsourcing platforms such as Amazon's Mechanical Turk, which give requesters affordable and rapid access to a global workforce, greatly facilitates the creation of massive training datasets. Most of the available studies on the effectiveness of crowdsourcing report on English data. We use Mechanical Turk annotations to train an opinion mining system to classify Spanish consumer comments. We design three different Human Intelligence Task (HIT) strategies and report high inter-annotator agreement between non-expert and expert annotators. We evaluate the advantages and drawbacks of each HIT design and show that, in our case, the use of non-expert annotations is a viable and cost-effective alternative to expert annotations.
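
The abstract reports inter-annotator agreement between crowd and expert annotations. As a minimal, hypothetical sketch of how such a comparison can be carried out, assuming majority-voted Mechanical Turk labels and Cohen's kappa as the agreement metric (the abstract does not name the specific metric, aggregation scheme, or label set; the labels below are invented for illustration):

```python
# Illustrative sketch: comparing majority-voted crowd labels against expert labels
# with Cohen's kappa. The polarity labels and the choice of kappa are assumptions
# made for this example; they are not taken from the paper.
from collections import Counter
from sklearn.metrics import cohen_kappa_score

def majority_vote(worker_labels):
    """Aggregate several Mechanical Turk labels for one item into a single label."""
    return Counter(worker_labels).most_common(1)[0][0]

# Hypothetical polarity annotations for five Spanish consumer comments.
expert = ["pos", "neg", "neg", "pos", "neu"]
turk_raw = [
    ["pos", "pos", "neg"],
    ["neg", "neg", "neg"],
    ["neg", "neu", "neg"],
    ["pos", "pos", "pos"],
    ["pos", "pos", "neu"],
]
crowd = [majority_vote(labels) for labels in turk_raw]

print("Cohen's kappa (expert vs. crowd):", cohen_kappa_score(expert, crowd))
```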
