Abstract

Corpus linguistics has contributed to lexicography in a number of ways (cf. Hanks, 2009). However, it is probably in the lexicographical treatment of phraseology that corpus linguistics has had the most revolutionizing effect. Evidence of word use in corpora has shown to an unprecedented extent that words are not isolates but rather combine with each other in preferred syntagmatic patterns (e.g. Biber and Conrad, 1999; Hanks, forthcoming). Corpus query systems are now highly sophisticated and incorporate cutting-edge tools and statistics to extract word combinations (Kilgarriff and Kosem, forthcoming). The impact of corpora in the area of phraseology, however, differs significantly across dictionaries both in terms of coverage and access. As regards coverage, collocations have received particular attention, especially in monolingual learners’ dictionaries (MLDs). By contrast, a whole range of recurrent phrases with essential discourse or pragmatic functions still need to find their place in dictionaries. As regards access, techniques range from highlighting a restricted number of word sequences in examples to providing lists of salient word combinations in collocation boxes. The aim of the chapter is to provide a critical overview of current developments in corpus-based lexicography and, more particularly, in the lexicographical treatment of phraseology. The chapter starts with a brief overview of the many contributions of corpus linguistics to lexicography before zooming in on phraseology. First, corpus linguistic tools and methods that lexicographers can use to identify phraseological units are described. Second, results from studies that have compared the phraseology of words as evidenced in corpora and as found in dictionaries (e.g. Moon, 2008; Walker, 2009) are used to assess the impact of corpora on the lexicographical treatment of phraseology in native-speaker dictionaries, learners’ dictionaries, bilingual dictionaries and specialised dictionaries.
Current limitations of corpus-based lexicography are exposed (mostly related to corpus design), and a case is made for the use of a wider range of specialised corpora (e.g. genre or domain-specific) in dictionary making. The second section illustrates how specialised corpora can inform the lexicographical treatment of phraseology. It reports on a study that makes use of the notions of ‘precision’ and ‘recall’ (Salton, 1989) to investigate the usefulness of phraseological information in electronic MLDs for academic writing, a genre that is of particular importance for non-native students and researchers in academic settings (Paquot, 2011). Using the Sketch Engine, I conducted a co-occurrence analysis of several verbs in the 90-million-word Corpus of Academic Journal Articles (Kosem, 2010). The results were compared with the collocations listed for these verbs in five MLDs (CALD, CCAD, LDOCE, MEDAL and OALD) to assess the coverage of academic collocations. Findings indicate that, although they certainly represent the best attempts at incorporating corpus-based descriptions of language, electronic MLDs could do much better. A number of word combinations that have essential discourse functions in academic writing are missing from the ‘Big Five’. In addition, their undifferentiated treatment of phraseology could lead non-native writers to believe that all collocations and phrases are suitable for all purposes (e.g. writing a research article or an informal letter). The chapter ends with suggestions for further and better integration of corpus data (and corpus query tools) to improve the coverage of and access to phraseology in dictionaries.
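
The comparison between corpus-attested collocations and dictionary entries can be sketched as a simple set computation. The Python snippet below is only an illustration and makes two assumptions the abstract does not confirm: it uses the standard information-retrieval definitions of precision and recall, and it treats the corpus-derived collocate list as the reference set. The function name `precision_recall` and the word lists are hypothetical and invented for the example; they are not data or results from the study.

```python
def precision_recall(dictionary_collocates, corpus_collocates):
    """Compare the collocates a dictionary lists for a headword with
    those attested in a reference corpus (assumed to be the gold standard).

    precision = share of dictionary collocates that are corpus-attested
    recall    = share of corpus-attested collocates the dictionary lists
    """
    listed = set(dictionary_collocates)
    attested = set(corpus_collocates)
    hits = listed & attested

    precision = len(hits) / len(listed) if listed else 0.0
    recall = len(hits) / len(attested) if attested else 0.0
    # Corpus-attested collocates that the dictionary does not record.
    missing = sorted(attested - listed)
    return precision, recall, missing


if __name__ == "__main__":
    # Invented toy lists for a single verb, for illustration only.
    corpus = ["strongly", "evidence", "findings", "results"]
    dictionary = ["strongly", "results", "idea"]

    p, r, missing = precision_recall(dictionary, corpus)
    print(f"precision={p:.2f} recall={r:.2f} missing={missing}")
```

Under these assumptions, low recall would point to academic collocations absent from a dictionary, while low precision would point to listed collocations that are not typical of the academic corpus.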
