Abstract
Genre and domain are well-known covariates of both manual and automatic annotation quality. Comparatively less is known about the effect of sentence types, such as imperatives, questions or fragments, and how they interact with text type effects. Using mixed effects models, we evaluate the relative influence of genre and sentence types on automatic and manual annotation quality for three related tasks in English data: POS tagging, dependency parsing and coreference resolution. For the latter task, we also develop a new metric for the evaluation of individual regions of coreference annotation. Our results show that while there are substantial differences between manual and automatic annotation in each task, sentence type is generally more important than genre in predicting errors within our data.
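To make the modeling setup concrete, the following is a minimal, hypothetical sketch (not the authors' code) of a mixed effects analysis of the kind described: annotation errors are predicted from sentence type and genre as fixed effects, with a random intercept per document. The column names (error, sent_type, genre, doc_id) and the toy data are invented for illustration, and the paper presumably used standard mixed-effects tooling such as lme4 in R; here Python's statsmodels is used instead. For a binary error outcome a logistic mixed model would be the more faithful choice, but a linear MixedLM keeps the sketch compact.

```python
# Hypothetical sketch: relative influence of sentence type vs. genre on
# annotation errors, via a mixed-effects model. Not the authors' analysis;
# all names and data below are illustrative assumptions.
import pandas as pd
import statsmodels.formula.api as smf

# Toy data: one row per annotated unit, with a binary error indicator,
# categorical sentence type and genre, and the source document as a
# grouping factor. Real data would span many documents and tokens;
# estimates on a sample this small are meaningless.
df = pd.DataFrame({
    "error":     [0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1],
    "sent_type": ["decl", "frag", "imp"] * 4,
    "genre":     ["news"] * 6 + ["howto"] * 6,
    "doc_id":    ["d1"] * 3 + ["d2"] * 3 + ["d3"] * 3 + ["d4"] * 3,
})

# Fixed effects for sentence type and genre, random intercept per document.
# Comparing the magnitude and significance of the fixed-effect coefficients
# indicates which covariate better predicts errors.
model = smf.mixedlm("error ~ C(sent_type) + C(genre)", df,
                    groups=df["doc_id"])
result = model.fit()
print(result.summary())
```

In a design like this, a larger sentence-type coefficient relative to the genre coefficients would correspond to the paper's finding that sentence type outpredicts genre.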
Highlights
With the availability of increasingly diverse language resources and the viability of processing almost unrestricted Web data, domain adaptation and coverage of novel domains have become a major concern in NLP and corpus creation.
Accuracy for both state-of-the-art automatic tools and manual annotation of new tasks is typically reported on standard sources, usually newswire text, which often leads to overestimation of expected accuracy in both manual and automatic annotation.
It has been suggested that at least part of the source for these problems lies in less frequent kinds of utterances within and across domains, i.e. that domain adaptation may be folding in sentence type effects.
Summary
With the availability of increasingly diverse language resources and the viability of processing almost unrestricted Web data, domain adaptation and coverage of novel domains have become a major concern in NLP and corpus creation (see e.g. Daumé 2007, Finkel & Manning 2009, McClosky et al. 2010, Søgaard 2013). Accuracy for both state-of-the-art automatic tools and manual annotation of new tasks is typically reported on standard sources, usually newswire text, which often leads to overestimation of expected accuracy in both manual and automatic annotation. In the development of automatic annotation tools, explicit partitioning of sentence types for differential treatment is rare (for an exception see Zhang et al. 2008 on machine translation).