Abstract

Abstract This article presents the Brazilian Portuguese-Russian (BraPoRus) corpus, whose goal is to collect, analyze, and preserve for posterity the spoken heritage Russian still used today in Brazil by approximately 1,500 elderly bilingual heritage Russian–Brazilian Portuguese speakers. Their unique 100-year-old variety of moribund Russian is disappearing because it has not been passed to their descendants born in Brazil. During the COVID-19 pandemic, we remotely collected 170 h of speech samples in heritage Russian from 26 participants (M age = 75.7 years) in naturalistic settings using Zoom or a phone call. To estimate the quality of collected data, we focus on two methodological challenges, automatic transcription and acoustic quality of remote recordings. First, we find that among commercially available transcription programs, Sonix far outperforms Google Transcribe and Vocalmatic on the measure of word error rate (WER). Second, we also establish that the acoustic quality of the remote recordings was adequate for intonational and speech rate analysis. Moreover, this remote method of collecting and analyzing speech samples works successfully with elderly bilingual participants who speak a heritage language different from their dominant societal language, and it can become a new norm when face-to-face communication with elderly participants is not possible.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.