Abstract

The relative merits of corpus and native speaker judgment data are a topic of long-standing debate in linguistics (Labov 1972; Fillmore 1992, inter alia). In this paper, we approach the question from the perspective of grammar engineering, and argue that (unsurprisingly to some, cf. Fillmore) these sources of data are best treated as complementary to one another. Further, we argue that encoding native speaker intuitions in a broad-coverage precision implemented grammar and then using the grammar to process a corpus is an effective way to explore the interaction between the two sources of data, while illuminating both. We discuss how the corpus can be used to constructively road-test such a grammar and ultimately extend its coverage. We also examine limitations of fully corpus-driven grammar development, and motivate the continued use of judgment data throughout the evolution of a precision grammar. Our use of corpus data is limited to evaluating the grammar and exposing gaps in its lexical and constructional coverage, whereas actual grammar development is based on a combination of corpus and judgment data. In
