The Spoken BNC2014

Robbie Love, Claire Dembry, Tony McEnery, Vaclav Brezina, Andrew Hardie

doi:10.1075/ijcl.22.3.02lov

Open Access

Abstract

This paper introduces the Spoken British National Corpus 2014, an 11.5-million-word corpus of orthographically transcribed conversations among L1 speakers of British English from across the UK, recorded in the years 2012–2016. After showing that a survey of the recent history of corpora of spoken British English justifies the compilation of this new corpus, we describe the main stages of the Spoken BNC2014’s creation: design, data and metadata collection, transcription, XML encoding, and annotation. In doing so we aim to (i) encourage users of the corpus to approach the data with sensitivity to the many methodological issues we identified and attempted to overcome while compiling the Spoken BNC2014, and (ii) inform (future) compilers of spoken corpora of the innovations we implemented to attempt to make the construction of corpora representing spontaneous speech in informal contexts more tractable, both logistically and practically, than in the past.

Highlights

The ESRC Centre for Corpus Approaches to Social Science (CASS) at Lancaster University and Cambridge University Press have compiled a new, publicly accessible corpus of present-day spoken British English, gathered in informal contexts, known as the Spoken British National Corpus 2014 (Spoken BNC2014).
A new corpus of conversational British English is needed to allow researchers to continue the kinds of research that the Spoken BNC1994 has fostered over the past two decades. This new corpus will make it possible to turn the ageing of the Spoken BNC1994 into an advantage: if the older corpus can be compared with a comparable contemporary corpus, it could become a useful resource for exploring recent change in spoken English.
We have presented a general overview of the design and compilation process of the Spoken BNC2014.

Summary

Introduction

The ESRC Centre for Corpus Approaches to Social Science (CASS) at Lancaster University and Cambridge University Press have compiled a new, publicly accessible corpus of present-day spoken British English, gathered in informal contexts, known as the Spoken British National Corpus 2014 (Spoken BNC2014). This design necessarily represents a compromise between the ideally representative corpus and the constraints of what is realistically possible.

Similar existing corpora – why do we need a new one?

The Spoken British National Corpus 1994

Other British English corpora containing spoken conversational data

Justification for the Spoken BNC2014

Corpus design and data collection

Opportunistic data collection

Recruitment of participants and audio recording

Metadata categories in the Spoken BNC2014

Higher professional occupations

Transcribing the Spoken BNC2014

Developing the transcription scheme

Speaker identification

Converting the transcripts

Findings

Conclusion

Journal: International Journal of Corpus Linguistics
Publication Date: Nov 23, 2017
Citations: 173
License type: CC BY 4.0
