Abstract

In Corpus Linguistics: Readings in a Widening Discipline, Geoffrey Sampson and Diana McCarthy present a diverse yet accessible collection of forty-two previously published papers, which introduce, discuss and exploit the methods of corpus linguistics. The volume can be divided into papers that investigate language through the analysis of corpora and papers that discuss how a corpus should be compiled and analysed. The most straightforward introduction to corpus construction is John Sinclair’s (1987) paper Corpus Creation, which considers the significance of factors such as medium, corpus size and sample size. A more technical treatment of representativeness in corpus design is found in a 1992 paper by Douglas Biber. Papers that discuss actual corpora include W. Nelson Francis’s 1965 description of the Brown Corpus, the first electronic corpus of English; Burnage and Dunlop’s outline of the compilation and annotation of the 100-million-word British National Corpus; and Tent and Mugler’s 1996 defence of the Corpus of Fijian English, a component of the International Corpus of English. The editors also include Adam Kilgarriff’s 2001 paper Web as Corpus – an approach to corpus linguistics which will undoubtedly become more common in future research. Most of the papers in the collection, however, report research based on corpus analyses, and most of these are grammatical studies. The volume opens with an excerpt from Charles C. Fries’s 1957 The Structure of English – an early empirical description of English grammar, based on a manual analysis of a corpus of recorded telephone conversations. The collection also includes F.G.A.M. Aarts’ 1971 study, which demonstrates that English object noun phrases tend to be more complex than subject noun phrases – a result that challenges the assumption, inherent in standard phrase structure rules, that the position of a noun phrase is structurally insignificant. More recent papers discuss grammatical topics such as treebanks and automatic parsing techniques. Geoffrey Sampson, one of the editors, contributes an introductory paper that argues for the importance of treebanks – corpora in which each sentence is associated with a parse tree. The construction of the Penn
