Abstract

Labelling data is one of the most fundamental activities in science. For decades it has underpinned practice in fields such as medicine and psychology (Agresti, 2003; Siegel and Castellan, 1988), as well as research in content analysis (Krippendorff, 2004a) and corpus linguistics (McEnery and Wilson, 2019). With the shift in Artificial Intelligence (AI) toward Machine Learning, the creation of datasets and corpora for training and evaluating AI systems has become a central activity in the field as well, including in Natural Language Processing (NLP), the area of AI with which we are primarily concerned (Ide and Pustejovsky, 2017).

