Abstract

Comparing languages and linguistic data is essential to progress in understanding the nature of spoken languages, since we understand phenomena better through comparison and contrast. This paper discusses problems that arise when a spoken language corpus transcribed and formatted according to one standard is transferred into the standard and format of another corpus. These problems stem both from differences between the standards of the two corpora and from human errors that make the transcriptions less reliable. Although the discussion is based on transfer and transliteration between two specific corpora (the Danish BySoc, BySociolingvistisk Korpus, and the Swedish GSLC, Göteborg Spoken Language Corpus), we believe that it documents and highlights problems of a general kind that have to be faced whenever spoken language corpora of different formats are to be compared.
