Abstract

Annotated data is a prerequisite for many NLP applications. Acquiring large-scale annotated corpora is a major bottleneck, requiring significant time and resources. Recent work has proposed turning annotation into a game to increase its appeal and lower its cost; however, current games are largely text-based and closely resemble traditional annotation tasks. We propose a new linguistic annotation paradigm that produces annotations from playing graphical video games. The effectiveness of this design is demonstrated using two video games: one that creates a mapping from WordNet senses to images, and a second that performs Word Sense Disambiguation. Both games produce accurate results. The first game yields annotation quality equal to that of experts and a cost reduction of 73% over equivalent crowdsourcing; the second game provides a 16.3% improvement in accuracy over current state-of-the-art sense disambiguation games that use WordNet.
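To make the sense-to-image mapping mentioned above concrete, the sketch below shows one plausible way to represent such a library using NLTK's WordNet interface. The sense_image_library dictionary, its example URLs, and the images_for helper are hypothetical illustrations, not the actual data format produced by the game.

    # Illustrative sketch only: a mapping from WordNet senses to images, keyed by
    # canonical synset name. Requires NLTK with the WordNet corpus installed
    # (nltk.download('wordnet')). The image URLs are hypothetical placeholders.
    from nltk.corpus import wordnet as wn

    # Hypothetical library: each sense maps to the images collected for it.
    sense_image_library = {
        wn.synset("bank.n.01").name(): ["https://example.org/img/riverbank_1.jpg"],      # sloping land
        wn.synset("bank.n.02").name(): ["https://example.org/img/bank_building_1.jpg"],  # financial institution
    }

    def images_for(word, pos=wn.NOUN):
        """Yield (synset name, gloss, images) for each WordNet sense of `word`."""
        for synset in wn.synsets(word, pos=pos):
            yield synset.name(), synset.definition(), sense_image_library.get(synset.name(), [])

    for name, gloss, images in images_for("bank"):
        print(f"{name}: {gloss}")
        for url in images:
            print(f"  image: {url}")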

Highlights

  • Most of Natural Language Processing (NLP) depends on annotated examples, either for training systems or for evaluating their quality

  • The top-rated image was preferred for nouns, verbs, and adjectives 57.4%, 53.1%, and 56.2% of the time, respectively; this preference is not significant at p < 0.05, indicating that the top-ranked images produced through Puzzle Racer game play are approximately equivalent in quality to images manually chosen by experts with full knowledge of the sense inventory

  • With Puzzle Racer, we demonstrated that game play can produce a high-quality library of images associated with WordNet senses, equivalent to those produced by expert annotators


Summary

Introduction

Most of Natural Language Processing (NLP) depends on annotated examples, either for training systems or for evaluating their quality. Annotations are traditionally created by linguistic experts or trained annotators. Such effort is often very time- and cost-intensive, and as a result creating large-scale annotated datasets remains a longstanding bottleneck for many areas of NLP. As an alternative to expert-based annotation, many studies have used untrained online workers, an approach commonly known as crowdsourcing. When successful, crowdsourcing enables gathering annotations at scale; however, its performance is still limited by (1) the difficulty of expressing the annotation task as a simple, well-understood task suitable for the layman, (2) the cost of collecting many annotations, and (3) the tediousness of the task, which can fail to attract workers. Turning an annotation task into a Game with a Purpose (GWAP) has been shown to lead to better-quality results and higher worker engagement (Lee et al., 2013), thanks to annotators being stimulated by the playful component. Because games may appeal to a different group of people than crowdsourcing, they also provide a complementary channel for attracting new annotators.

