Abstract
This paper investigates multi-word strings automatically retrieved from a 5-million-word corpus of conversational English from Britain and Ireland. Many such strings have neither syntactic nor semantic integrity, for example "at the", "it was a", and "what do you". However, many strings display pragmatic integrity, encoding interactive functions such as hedging, vagueness, and discourse marking. Examples include "and that sort of thing", "you know", and "a couple of". We identify the most common pragmatically integrated clusters, discuss their functions, and compare their frequency with that of single words, showing that many clusters are more frequent than single words accepted as belonging to the core vocabulary of English. The high frequency of these clusters also contrasts with the low frequency of opaque idiomatic expressions. High-frequency clusters raise issues around the distinction between lexis and grammar, and support a synthetic view of language production and storage, with implications for the understanding of notions such as fluency and idiomaticity.