Two Sepedi-English code-switched speech corpora

Thipe I Modipa, Marelie H Davel

doi:10.1007/s10579-022-09592-6

Abstract

We report on the development of two reference corpora for the analysis of Sepedi-English code-switched speech in the context of automatic speech recognition. For the first corpus, possible English events were obtained from an existing corpus of transcribed Sepedi-English speech. The second corpus is based on the analysis of radio broadcasts: actual instances of code switching were transcribed and reproduced by a number of native Sepedi speakers. We describe the process to develop and verify both corpora and perform an initial analysis of the newly produced data sets. We find that, in naturally occurring speech, the frequency of code switching is unexpectedly high for this language pair, and that the continuum of code switching (from unmodified embedded words to loanwords absorbed into the matrix language) makes this a particularly challenging task for speech recognition systems.

Journal: Language Resources and Evaluation
Publication Date: Jun 6, 2022
Citations: 1

Similar Papers

Unusual patterns of codeswitching in an unbalanced bilingual person with aphasia: effects of language and executive functions impairment
Alina Bihovsky ... Natalia Meir
Aphasiology | VOL. ahead-of-print
23 Sep 2024

Teacher Code Switching Consistency and Precision in a Multilingual Mathematics Classroom
Clemence Chikiwa ... Marc Schäfer
African Journal of Research in Mathematics, Science and Technology Education | VOL. 20
01 Sep 2016

MECOS: A bilingual Manipuri–English spontaneous code-switching speech corpus for automatic speech recognition
Naorem Karline Singh ... Hoomexsun Pangsatabam
Computer Speech & Language | VOL. 87
20 Feb 2024

CODE SWITCHING IN EFL CLASSROOM: A NARRATIVE INQUIRY INTO TEACHERS’ EXPERIENCES AND PERCEPTIONS
Nirwanto Maruf ... Indiarti Yasmin Nabillah
TELL-US JOURNAL | VOL. 9
30 Jun 2023
