Abstract
New Zealand linguists have been involved over the last eight years in planning and collecting data for a number of different written and spoken corpora of New Zealand English. These include the Wellington Corpus of New Zealand English (WCNZE), with its one-million-word written and one-million-word spoken components, and the New Zealand contributions to the International Corpus of English (ICE) Project, which involved a total of one million words composed of representative extracts of written and spoken New Zealand English. This paper describes some of the methodological problems encountered in collecting material for a spoken corpus of New Zealand English, including the issue of who counts as a speaker of New Zealand English, the problems of collecting data in particular categories, and the procedures put in place to process collected data.

The idea of collecting a Corpus of New Zealand English had been discussed by New Zealand linguists since the mid-1980s. A number of New Zealand linguists had been using corpora in their research into vocabulary (Kennedy, 1991; Bauer and Nation, 1993) and into the expression of speech functions such as quantity (Kennedy, 1987), causation (Kennedy and Fang, 1992) and certainty (Holmes, 1982, 1983). They were very aware of the valuable resources which had been made available by the Brown Corpus of American English in the early 1960s, the LOB Corpus of British written English in 1978, and the LUND Corpus of British spoken English in 1980. In 1987, after much debate about design and methodology, linguists at Victoria University began collecting data for the Wellington Corpus of Written and Spoken New Zealand English. Hence, when Sidney Greenbaum proposed that an International Corpus of English should be gathered (1988), it seemed sensible to ensure that New Zealand linguists also collected material suitable for inclusion in that corpus.