Abstract

The Corpus of Historical American English (COHA) contains 400 million words in more than 100,000 texts which date from the 1810s to the 2000s. The corpus contains texts from fiction, popular magazines, newspapers and non-fiction books, and is balanced by genre from decade to decade. It has been carefully lemmatised and tagged for part-of-speech, and uses the same architecture as the Corpus of Contemporary American English (COCA), BYU-BNC, the TIME Corpus and other corpora. COHA allows for a wide range of research on changes in lexis, morphology, syntax, semantics, and American culture and society (as viewed through language change), in ways that are probably not possible with any text archive (e.g., Google Books) or any other corpus of historical American English.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call