Abstract

Given the importance of specialized vocabulary in scientific communication and academic discourse, there is a growing need for wordlists that address the vocabulary-learning needs of university students and researchers in different subject areas. The current study analyzed a corpus of chemistry research articles (278 million running words) to establish a mid-frequency vocabulary list for this field. Using frequency, range, and dispersion criteria, the study identified 560 lemmas in the fourth to ninth British National Corpus/Corpus of Contemporary American English (BNC/COCA) lists that provided 6.4% coverage of all words in the corpus. The list was validated against specialized and general corpora, and the results confirmed the value and relevance of the items for chemistry. To support pedagogical use of the list, the vocabulary items were divided into five bands based on their coverage and importance. The 100 words in the first band were the most important mid-frequency vocabulary in chemistry, as they provided 3.05% coverage. The study highlights the significant contribution of mid-frequency words in research articles, and the findings have implications for using large corpora as a big data source in identifying specialized and field-specific vocabulary.

