Abstract

This paper provides an overview of the Corpus of History English Texts, one of the component parts of the Coruña Corpus of English Scientific Writing (Moskowich and Crespo 2012), looking in particular at the communicative formats that it contains. Among the defining characteristics of the Coruña Corpus are that it is diachronic in nature, and that it can be considered either as a single- or multi-genre corpus, according to the theoretical tenets adopted (Kytö 2010; McEnery and Hardie 2013). The corpus has been designed as a tool for the study of language change in English scientific writing in general, and more specifically in the different scientific disciplines which have been sampled in each subcorpus. All the texts compiled were published between 1700 and 1900, thus offering a thorough view of late Modern English scientific discourse, a period often neglected in English historical studies (De Smet 2005). The analysis of this variety of English is also useful as a means of achieving a clear and detailed description of the origins of English as “the language of science”.

Highlights

  • This paper offers a description of the Corpus of History English Texts (CHET), focusing mainly on the external factors of the compiled texts, such as sex, age and geographical provenance of authors, and genre/text-type.

  • The paper is divided into four main sections, the first of which will present the history of the Coruña Corpus (CC), the core project within which CHET is found.

  • Section two will focus on the description of CHET itself, paying special attention to those extra-linguistic factors which are peculiar to it, each one dealt with in its own subsection.


Summary

Introduction

This paper offers a description of the Corpus of History English Texts (CHET), focusing mainly on the external factors of the compiled texts, such as sex, age and geographical provenance of authors, and genre/text-type.

The Coruña Corpus and its family history

The CC project was initiated in 2003 with the intention of facilitating linguistic research into eighteenth- and nineteenth-century scientific texts at all levels. The novelty it offers is the possibility of using these texts for socio-historical as well as linguistic research, achieved through the inclusion of metadata files containing personal details about the authors of each sample (age, sex, place of education) and about the works (date of publication, genre/text-type) from which the samples have been extracted (Crespo and Moskowich 2010; Moskowich 2012).

The samples are of ca. 10,000 words each, as is the case in the CC as a whole, with a similar number of samples and words for both centuries, as set out in Table 1 below:

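[Table 1: columns "Century" and "Words"; the values did not survive extraction.]

[Figure residue: "Number of authors", with the labels England, Ireland, Massachusetts, New York, Nova Scotia, Scotland, Unknown, and the series labels CHET and CEPhiT.]

[Figure residue: "Communicative formats in the Hard Sciences", with the labels Treatise, Essay, Travelogue, Lecture, Textbook, Article, Dictionary, Biography.]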